From 55a72ad9bd3f81fa72457675bc82ee4efbb8ca91 Mon Sep 17 00:00:00 2001
From: Cole Lyman
Date: Thu, 8 Aug 2024 13:11:08 -0600
Subject: [PATCH] Replace zcat (#94)

* D3-Enhancements (#78)

* Sam/try plots (#71)

* Fix batch mode pandas warning. (#70)

* refactor to call method on DataFrame, rather than Series.
Removes warning.

* Fix pandas future warning in CRISPRessoWGS

---------

Co-authored-by: Cole Lyman

* Functional

* Cole/fix status file name (#69)

* Update config file logging messages

This removes printing the exception (which is essentially a duplicate), and adds a condition if no config file was provided. Also changes `json` to `config` so that it is more clear.

* Fix divide by zero when no amplicons are present in Batch mode

* Don't append file_prefix to status file name

* Place status files in output directories

* Update tests branch for file_prefix addition

* Load D3 and plotly figures with pro with multiple amplicons

* Update batch

* Fix bug in CRISPRessoCompare with pointing to report datas with file_prefix

Before this fix, when using a file_prefix the second run that was compared would not be displayed as a data in the first figure of the report.

* Import CRISPRessoPro instead of importing the version

When installed via conda, the version is not available

* Remove `get_amplicon_output` unused function from CRISPRessoCompare

Also remove unused argparse import

* Implement `get_matching_allele_files` in CRISPRessoCompare and accompanying unit tests

* Allow for matching of multiple guides in the same amplicon

* Fix pandas FutureWarning

* Change test branch back to master

---------

Co-authored-by: Sam

* Try catch all futures

* Fix test fail plots

* Point test to try-plots

* Fix d3 not showing and plotly mixing with matplotlib

* Use logger for warnings and debug statements

* Point tests back at master

---------

Co-authored-by: mbowcut2 <55161542+mbowcut2@users.noreply.github.com>
Co-authored-by: Cole Lyman

* Sam/fix plots (#72)

* Fix batch mode pandas warning. (#70)

* refactor to call method on DataFrame, rather than Series.
Removes warning.

* Fix pandas future warning in CRISPRessoWGS

---------

Co-authored-by: Cole Lyman

* Functional

* Cole/fix status file name (#69)

* Update config file logging messages

This removes printing the exception (which is essentially a duplicate), and adds a condition if no config file was provided. Also changes `json` to `config` so that it is more clear.

* Fix divide by zero when no amplicons are present in Batch mode

* Don't append file_prefix to status file name

* Place status files in output directories

* Update tests branch for file_prefix addition

* Load D3 and plotly figures with pro with multiple amplicons

* Update batch

* Fix bug in CRISPRessoCompare with pointing to report datas with file_prefix

Before this fix, when using a file_prefix the second run that was compared would not be displayed as a data in the first figure of the report.

* Import CRISPRessoPro instead of importing the version

When installed via conda, the version is not available

* Remove `get_amplicon_output` unused function from CRISPRessoCompare

Also remove unused argparse import

* Implement `get_matching_allele_files` in CRISPRessoCompare and accompanying unit tests

* Allow for matching of multiple guides in the same amplicon

* Fix pandas FutureWarning

* Change test branch back to master

---------

Co-authored-by: Sam

* Try catch all futures

* Fix test fail plots

* Fix d3 not showing and plotly mixing with matplotlib

---------

Co-authored-by: mbowcut2 <55161542+mbowcut2@users.noreply.github.com>
Co-authored-by: Cole Lyman

* Remove token from integration tests file

* Provide sgRNA_sequences to plot_nucleotide_quilt plots

* Passing sgRNA_sequences to plot

* Refactor check for determining when to use CRISPREssoPro or matplotlib for Batch plots

* Add max-height to Batch report samples

* Change testing branch

* Fix wrong check for large Batch plots

* Fix typo and move flexiguide to debug (#77)

* Change flexiguide output to debug level

* Fix typo in fastp merged output file name

* Adding id tags for d3 script enhancements

* pointing to test branch

* Add amplicon_name parameter to allele heatmap and line plots

* Add function to extract quantification window regions from include_idxs

* Scale the quantification window according to the coordinates of the sgRNA plot

* added c2pro check, added space in args.json

* Correct the quantification window indexes for multiple guides

* Fix name of nucleotide conversion plot when guides are not the same

* Fix jinja variables that aren't found

* Fix multiple guide errors where the wrong sgRNA sequence was associated in d3 plot

* Remove unneeded variable and extra whitespace

* Switch test branch to master

---------

Co-authored-by: Samuel Nichols
Co-authored-by: mbowcut2 <55161542+mbowcut2@users.noreply.github.com>
Co-authored-by: Cole Lyman

* Replace zcat with gunzip -c in `get_most_frequent_reads`

---------

Co-authored-by: Trevor Martin <60452953+trevormartinj7@users.noreply.github.com>
Co-authored-by: Samuel Nichols
Co-authored-by: mbowcut2 <55161542+mbowcut2@users.noreply.github.com>
---
 CRISPResso2/CRISPRessoShared.py | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/CRISPResso2/CRISPRessoShared.py b/CRISPResso2/CRISPRessoShared.py
index fc8e8ded..0ef79313 100644
--- a/CRISPResso2/CRISPRessoShared.py
+++ b/CRISPResso2/CRISPRessoShared.py
@@ -782,13 +782,13 @@ def get_most_frequent_reads(fastq_r1, fastq_r2, number_of_reads_to_consider, fas

     view_cmd_1 = 'cat'
     if fastq_r1.endswith('.gz'):
-        view_cmd_1 = 'zcat'
+        view_cmd_1 = 'gunzip -c'
     file_generation_command = "%s %s | head -n %d " % (view_cmd_1, fastq_r1, number_of_reads_to_consider * 4)

     if fastq_r2:
         view_cmd_2 = 'cat'
         if fastq_r2.endswith('.gz'):
-            view_cmd_2 = 'zcat'
+            view_cmd_2 = 'gunzip -c'
         min_overlap_param = ""
         if min_paired_end_reads_overlap:
             min_overlap_param = "--overlap_len_require {0}".format(min_paired_end_reads_overlap)
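
Note on the change: the diff applies the same `.gz` check and the same replacement to both `fastq_r1` and `fastq_r2`. The sketch below shows that selection factored into one place. It is illustrative only and not part of the patch: `view_command` and `head_reads_command` are hypothetical names, and the `shlex.quote` call is an added assumption (the patched code interpolates the path unquoted). The commit message does not state a motivation; a likely one is portability, since `gunzip -c` streams decompressed data to standard output, whereas `zcat` on some platforms (macOS, for example) expects a `.Z` suffix.

```python
import shlex


def view_command(path):
    """Return the shell command that streams a FASTQ file to stdout.

    Hypothetical helper: mirrors the check in the diff, where files
    ending in '.gz' are read with 'gunzip -c' and all others with 'cat'.
    """
    return 'gunzip -c' if path.endswith('.gz') else 'cat'


def head_reads_command(fastq, number_of_reads_to_consider):
    """Build the pipeline that emits the first N reads of a FASTQ file.

    Each FASTQ record spans 4 lines, hence the multiplication.
    shlex.quote is an addition not present in the patch; it protects
    paths that contain spaces or shell metacharacters.
    """
    return "%s %s | head -n %d" % (
        view_command(fastq),
        shlex.quote(fastq),
        number_of_reads_to_consider * 4,
    )


if __name__ == '__main__':
    print(head_reads_command('reads_R1.fastq.gz', 5000))
    print(head_reads_command('reads_R1.fastq', 5000))
```

With a helper of this shape, a future change to the decompression command would touch one line instead of two.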