Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement cram chunking and minimap2-based Hi-C alignments #113

Merged
merged 19 commits into from
Sep 17, 2024

Conversation

reichan1998
Copy link
Contributor

PR checklist

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool - have you followed the pipeline conventions in the contribution docs
  • Make sure your code lints (nf-core lint).
  • Ensure the test suite passes (nextflow run . -profile test,docker --outdir <OUTDIR>).
  • Usage Documentation in docs/usage.md is updated.
  • Output Documentation in docs/output.md is updated.
  • CHANGELOG.md is updated.
  • README.md is updated (including new tool citations and authors/contributors).

@tkchafin tkchafin self-requested a review September 10, 2024 12:37
@muffato
Copy link
Member

muffato commented Sep 10, 2024

filter_five_end.pl was actually written by the company Arima Genomics. It hasn't been done in TreeVal, but the copyright of this file needs to correctly stated.
Could you please change the copyright statement from being on priyanka (which is false) to:

Copyright (c) 2022-2024 Genome Research Ltd.
except `bin/filter_five_end.pl`:
Copyright (c) 2017 Arima Genomics, Inc.

@tkchafin : if you want to test the feature, try Myxine_glutinosa from TOLSD-2062. We just finished running it with the version 1.2.1 of the pipeline. The resulting CRAM file is 178 GB 🙀 and contains 6.5 billion reads. bwa-mem2 took more than 7 days to run, as a single process on 64 cores and 174 GB RAM.

@tkchafin
Copy link
Contributor

Sent PR reichan1998#2 covering the Illumina case. I think interleaved fastq should be handled correctly there but would be worth a double-check! :)

Copy link
Contributor

@tkchafin tkchafin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We also need to update the copyright and make sure the documentation is clear on interleaved fastq inputs but will send as a separate PR.
Also need to update CI to run both bwamem and minimap2 tests

@tkchafin tkchafin merged commit 51a1d9f into sanger-tol:dev Sep 17, 2024
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants