DSL2: genotyping #1016

Merged
TCLamnidis merged 121 commits into dev from dsl2-genotyping
Mar 20, 2024

Conversation

TCLamnidis
Collaborator

@TCLamnidis TCLamnidis commented Jul 21, 2023

TODOs:

  • Needs implementation of gatk_dbsnp (Will need tweaking once dbsnp gets a meta)
  • Pass parameters to modules
  • Add gatk_dbsnp to single ref, to at least be able to test it.
  • Fix schema
  • genotyping_gatk_dbsnp needs to fit a regex pattern (*.vcf, not gzipped)
  • Add warning when pileupCaller is requested but no snp/bed file is provided.
  • Add warning if only one of snpfile or bedfile is provided.
  • ADD TESTS!
  • Add EIGENSTRATSNPCOVERAGE and pass to MQC
  • Pass bcftools stats to MQC
  • Merge DSL2: Add rg info to BWA_MEM #1046 so tests stop failing.
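
The parameter checks in the TODO list above (uncompressed `.vcf` for the dbSNP file, warnings around the pileupCaller SNP/BED files) could be sketched roughly as follows. This is only an illustration in Python, not the pipeline's Nextflow code: the function name, argument names, regex, and warning texts are all hypothetical, and it assumes the "only one of snpfile or bedfile" warning applies regardless of the chosen genotyper.

```python
import re

# Hypothetical pattern: accept paths ending in ".vcf", reject ".vcf.gz".
DBSNP_PATTERN = re.compile(r"^\S+\.vcf$")


def check_genotyping_params(tool, dbsnp=None, bedfile=None, snpfile=None):
    """Return a list of warning strings for an illustrative parameter set."""
    warnings = []
    # The dbSNP file must be an uncompressed VCF.
    if dbsnp is not None and not DBSNP_PATTERN.match(dbsnp):
        warnings.append(f"dbSNP file '{dbsnp}' must be an uncompressed .vcf file")
    # pileupCaller was requested but neither file was given.
    if tool == "pileupcaller" and bedfile is None and snpfile is None:
        warnings.append("pileupCaller requested but no SNP or BED file provided")
    # Exactly one of the two files was given.
    if (bedfile is None) != (snpfile is None):
        warnings.append("only one of the SNP file and BED file was provided")
    return warnings


if __name__ == "__main__":
    for message in check_genotyping_params("pileupcaller", dbsnp="dbsnp.vcf.gz"):
        print("WARNING:", message)
```

In the pipeline itself the regex would presumably live in the parameter schema and the warnings in the workflow's validation code.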

Progress:

  • GATK UG
  • GATK HC
  • PileupCaller
  • angsd
  • freebayes

PR checklist

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
    • If you've added a new tool - add to the software_versions process and a regex to scrape_software_versions.py
    • If you've added a new tool - have you followed the pipeline conventions in the [contribution docs](https://github.com/nf-core/eager/tree/master/.github/CONTRIBUTING.md)
    • If necessary, also make a PR on the nf-core/eager branch on the nf-core/test-datasets repository.
  • Make sure your code lints (nf-core lint .).
  • Ensure the test suite passes (nextflow run . -profile test,docker).
  • Usage Documentation in docs/usage.md is updated.
  • Output Documentation in docs/output.md is updated.
  • CHANGELOG.md is updated.
  • README.md is updated (including new tool citations and authors/contributors).

@TCLamnidis TCLamnidis marked this pull request as draft July 21, 2023 14:32
@github-actions

github-actions bot commented Aug 4, 2023

nf-core lint overall result: Passed ✅ ⚠️

Posted for pipeline commit d39a10a

+| ✅ 246 tests passed       |+
#| ❔   1 tests were ignored |#
!| ❗  22 tests had warnings |!

❗ Test warnings:

  • readme - README contains the placeholder zenodo.XXXXXXX. This should be replaced with the zenodo doi (after the first release).
  • pipeline_todos - TODO string in main.nf: Remove this line if you don't need a FASTA file
  • pipeline_todos - TODO string in nextflow.config: Specify your pipeline's command line flags
  • pipeline_todos - TODO string in README.md: Include a figure that guides the user through the major workflow steps. Many nf-core
  • pipeline_todos - TODO string in README.md: Fill in short bullet-pointed list of the default steps in the pipeline
  • pipeline_todos - TODO string in ci.yml: You can customise CI pipeline run tests as required
  • pipeline_todos - TODO string in awsfulltest.yml: You can customise AWS full pipeline tests as required
  • pipeline_todos - TODO string in main.nf: Optionally add in-text citation tools to this list.
  • pipeline_todos - TODO string in main.nf: Optionally add bibliographic entries to this list.
  • pipeline_todos - TODO string in main.nf: Only uncomment below if logic in toolCitationText/toolBibliographyText has been filled!
  • pipeline_todos - TODO string in usage.md: Add documentation about anything specific to running your pipeline. For general topics, please point to (and add to) the main nf-core website.
  • pipeline_todos - TODO string in methods_description_template.yml: #Update the HTML below to your preferred methods description, e.g. add publication citation for this pipeline
  • pipeline_todos - TODO string in test_full.config: Specify the paths to your full test data ( on nf-core/test-datasets or directly in repositories, e.g. SRA)
  • pipeline_todos - TODO string in test_full.config: Give any required params for the test so that command line flags are not needed
  • pipeline_todos - TODO string in base.config: Check the defaults for all processes
  • pipeline_todos - TODO string in base.config: Customise requirements for specific processes.
  • pipeline_todos - TODO string in test.config: Specify the paths to your test data on nf-core/test-datasets
  • pipeline_todos - TODO string in test.config: Give any required params for the test so that command line flags are not needed
  • pipeline_todos - TODO string in test_humanbam.config: Specify the paths to your test data on nf-core/test-datasets
  • pipeline_todos - TODO string in test_humanbam.config: Give any required params for the test so that command line flags are not needed
  • schema_description - No description provided in schema for parameter: skip_qualimap
  • schema_description - No description provided in schema for parameter: skip_damagecalculation

❔ Tests ignored:

  • nextflow_config - Config default ignored: params.contamination_estimation_angsd_hapmap

✅ Tests passed:

Run details

  • nf-core/tools version 2.13.1
  • Run at 2024-03-19 15:44:26

@TCLamnidis
Collaborator Author

TCLamnidis commented Aug 4, 2023

TODOs:

  • Updating bcftools modules on nf-core/modules (need to add --exons tsv file. Started working on it locally)
  • Add BCFTOOLS_STATS

@TCLamnidis
Collaborator Author

TCLamnidis commented Mar 8, 2024

TODOs:

  • Add genotyper used in meta of output files (in case it is needed downstream).
  • GATK_UG module needs updating to zip files with bgzip, allowing indexing with BCFTOOLS.
  • Add BED to Freebayes? Still not sure about this; would like to discuss.
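
For context on the bgzip item above: bcftools indexing needs BGZF (bgzip) compression, and a plain gzip file will not do. A BGZF block is a gzip member whose header has the FEXTRA flag set and carries a `BC` extra subfield of length 2. The sketch below checks a file's leading bytes for that signature. The function name is hypothetical and this is not part of the pipeline, only a way to tell the two formats apart.

```python
import gzip
import struct


def is_bgzf(header: bytes) -> bool:
    """Return True if the leading bytes look like a BGZF (bgzip) block header."""
    # gzip magic bytes plus the DEFLATE compression method.
    if len(header) < 12 or header[:3] != b"\x1f\x8b\x08":
        return False
    # BGZF requires the FEXTRA flag (bit 2 of the FLG byte).
    if not header[3] & 0x04:
        return False
    (xlen,) = struct.unpack("<H", header[10:12])
    extra = header[12:12 + xlen]
    pos = 0
    # Walk the extra subfields looking for SI1='B', SI2='C', SLEN=2.
    while pos + 4 <= len(extra):
        si1, si2, slen = struct.unpack("<BBH", extra[pos:pos + 4])
        if (si1, si2, slen) == (66, 67, 2):
            return True
        pos += 4 + slen
    return False


if __name__ == "__main__":
    # Plain gzip output has no BC subfield, so it is not BGZF.
    print(is_bgzf(gzip.compress(b"example")))
```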

@TCLamnidis
Collaborator Author

This needs re-review. I did not address the Freebayes BED file issue. We will not implement it now (it would be a new feature anyway), and will only implement it if it is requested.


@jbv2 jbv2 left a comment


Overall looks good to me.
There are some outstanding TODOs, and CITATIONS.md is still missing.

docs/development/manual_tests.md (review thread resolved)
@TCLamnidis
Collaborator Author

  • Need to re-add the parameter combination validations with the new schema. They were left in as comments during merge conflict resolution.

Contributor

@scarlhoff scarlhoff left a comment


lgtm, huge effort 💪

@TCLamnidis
Collaborator Author

TCLamnidis commented Mar 19, 2024

GATK4_HAPLOTYPECALLER now fails because the input BAM has a different sample name in its RG than the one produced by the MAP subworkflow (SWF).

Updating test-datasets to fix this. nf-core/test-datasets#1141
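
To illustrate the kind of mismatch described above: the sample name sits in the `SM` tag of the `@RG` lines in the BAM header. A minimal sketch for pulling those names out of header text (for example the output of `samtools view -H`) and comparing them with an expected name is shown below. The function name, the example header, and the expected sample name are made up for illustration and are not taken from the test data.

```python
def read_group_samples(header_text: str) -> set:
    """Collect the SM (sample) values from all @RG lines of a SAM header."""
    samples = set()
    for line in header_text.splitlines():
        if not line.startswith("@RG"):
            continue
        # Fields after the record type are tab-separated TAG:VALUE pairs.
        for field in line.split("\t")[1:]:
            tag, _, value = field.partition(":")
            if tag == "SM":
                samples.add(value)
    return samples


if __name__ == "__main__":
    # Hypothetical header and expected sample name.
    header = "@HD\tVN:1.6\tSO:coordinate\n@RG\tID:lib1\tSM:sampleA\tPL:ILLUMINA\n"
    expected = "sampleB"
    found = read_group_samples(header)
    if found != {expected}:
        print(f"RG sample name mismatch: expected {expected!r}, found {sorted(found)}")
```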

@TCLamnidis TCLamnidis merged commit fd6fa52 into dev Mar 20, 2024
16 checks passed
@TCLamnidis TCLamnidis deleted the dsl2-genotyping branch March 20, 2024 10:10
@scarlhoff scarlhoff mentioned this pull request Jul 5, 2024
Development

Successfully merging this pull request may close these issues.

Combining the genotypes from single- and double-stranded samples
4 participants