
Dataops 906 multiqc updates #44

Open. Wants to merge 9 commits into base: master.

kjellinjonas

MultiQC updates:

removed multiqc_config_wgs_qc.yaml

  • No longer used

updated multiqc_config_wgs.yaml

  • Removed the mosdepth-coverage-per-contig plot since it was ugly and not informative
  • Updated formatting of the General stats table
    • Made the >=10X coverage column visible
    • Made irrelevant stats invisible
    • Grouped samples

updated multiqc_pipeline_info.py

  • Corrected input file so it is compatible with new sarek
  • Updated regex to work with the new input file

updated sample_list_for_multiqc.py

  • Changed identifier string for Sarek sample sheet
  • Made it compatible with new samplesheet format
    • Changed delimiter
    • Added step to skip header of samplesheet
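
The sample sheet handling described above can be sketched roughly as follows. This is a minimal illustration, not the code in sample_list_for_multiqc.py: the comma delimiter, the column layout, and the function name `read_sample_ids` are all assumptions made for the example.

```python
import csv
import io


def read_sample_ids(handle, delimiter=",", sample_column=1):
    """Return unique sample IDs from a delimited sample sheet.

    The first row is treated as a header and skipped.
    `delimiter` and `sample_column` are assumptions for this sketch.
    """
    reader = csv.reader(handle, delimiter=delimiter)
    next(reader, None)  # skip the header row; no error if the file is empty
    samples = []
    for row in reader:
        if len(row) <= sample_column:
            continue  # ignore blank or truncated rows
        sample = row[sample_column].strip()
        if sample and sample not in samples:
            samples.append(sample)  # keep first-seen order, drop duplicates
    return samples


# Hypothetical sample sheet content, used only to exercise the function.
example = io.StringIO(
    "patient,sample,lane,fastq_1,fastq_2\n"
    "P1,S1,L001,a_R1.fastq.gz,a_R2.fastq.gz\n"
    "P1,S1,L002,b_R1.fastq.gz,b_R2.fastq.gz\n"
    "P2,S2,L001,c_R1.fastq.gz,c_R2.fastq.gz\n"
)
sample_ids = read_sample_ids(example)
print(sample_ids)
```

Using `next(reader, None)` rather than `next(reader)` means an empty sheet yields an empty list instead of raising `StopIteration`.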

updated multiqc_sarek_project.sh

  • Now requires the project ID to be given on the command line, which makes the script more flexible.
    Previously the project ID was parsed from the input path, which is not ideal when running
    MultiQC from a non-canonical path, e.g. FU projects.
    multiqc_sarek_project $PWD
  • Now calls multiqc_extra_stats_qc.py
  • Creates a new folder and moves the output from multiqc_extra_stats_qc.py there.
    If these files were moved to the multiqc_custom_content folder, like the other custom files, the QC check would also be included in the user report.
  • Changed paths to be compatible with new sarek output folder structure
  • Updated MultiQC call to reference the new files

new script multiqc_extra_stats_qc.py

  • Calculates autosomal coverage, GC % and % mapped reads
  • Parses precalculated values for 10X/30X coverage, unfiltered variants and average insert size
  • Performs a QC check according to INS-00123
  • The values in the QC class should be updated whenever INS-00123 is updated
  • Outputs a YAML file with data to be added to the MultiQC General stats table
  • Outputs a YAML file that creates a QC section in the MultiQC report
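
As a rough illustration of the flow listed above (compute a statistic, check it against limits held in a QC class, write two YAML files), here is a minimal sketch. It is not the code in multiqc_extra_stats_qc.py. The threshold values are placeholders and are NOT taken from INS-00123. The names `QCThresholds`, `pct_mapped`, `qc_check`, `build_reports`, `write_reports`, the `qc.yaml` file name, and the exact YAML keys (based on my understanding of MultiQC's custom-content format) are all assumptions.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class QCThresholds:
    """Placeholder limits for this sketch; NOT the values from INS-00123."""
    min_autosomal_coverage: float = 30.0
    min_pct_mapped: float = 95.0


def pct_mapped(mapped_reads, total_reads):
    """Percentage of mapped reads; 0.0 when there are no reads at all."""
    if total_reads == 0:
        return 0.0
    return 100.0 * mapped_reads / total_reads


def qc_check(stats, thresholds=QCThresholds()):
    """Compare one sample's stats to the thresholds and report failures."""
    failed = []
    if stats["autosomal_coverage"] < thresholds.min_autosomal_coverage:
        failed.append("autosomal_coverage")
    if stats["pct_mapped"] < thresholds.min_pct_mapped:
        failed.append("pct_mapped")
    return {"status": "FAIL" if failed else "PASS", "failed": failed}


def build_reports(per_sample):
    """Build the two dictionaries that would be dumped to YAML."""
    extra_stats = {
        "id": "extra_stats",
        "plot_type": "generalstats",  # assumed custom-content key/value
        "data": per_sample,
    }
    qc_out = {
        "id": "qc_check",
        "section_name": "QC",
        "plot_type": "table",
        "data": {
            sample: {"QC status": qc_check(stats)["status"]}
            for sample, stats in per_sample.items()
        },
    }
    return extra_stats, qc_out


def write_reports(extra_stats, qc_out, outdir):
    """Write both reports; requires the third-party PyYAML package."""
    import yaml  # imported here so the rest of the sketch runs without PyYAML

    with open(os.path.join(outdir, "extra_stats.yaml"), "w") as fout:
        yaml.dump(extra_stats, fout)
    with open(os.path.join(outdir, "qc.yaml"), "w") as fout:  # hypothetical name
        yaml.dump(qc_out, fout)


# Made-up numbers, used only to exercise the functions.
per_sample = {
    "sampleA": {"autosomal_coverage": 32.1, "pct_mapped": pct_mapped(990, 1000)},
    "sampleB": {"autosomal_coverage": 12.0, "pct_mapped": pct_mapped(900, 1000)},
}
extra_stats, qc_out = build_reports(per_sample)
print(qc_out["data"])
```

Keeping the limits in one small class means an update to the instruction only touches that class, which matches the intent stated in the list above.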


@matrulda left a comment


Excellent work, I left some comments for you.

multiqc_pipeline_info.py (outdated, resolved)
multiqc_extra_stats_qc.py (outdated, resolved)
multiqc_extra_stats_qc.py (outdated, resolved)
multiqc_sarek_project.sh (resolved)
multiqc_sarek_project.sh (outdated, resolved)
multiqc_sarek_project.sh (outdated, resolved)
yaml.dump(qc_out, fout)
with open(os.path.join(analysis_dir, "extra_stats.yaml"), "w") as fout:
with open(os.path.join(outdir, "extra_stats.yaml"), "w") as fout:

Something happened with the indentation here.

multiqc_sarek_project.sh (resolved)

@matrulda left a comment


Looks good! Just something weird with the indentation on one line.
