update bcftools pluginsplit #7013
I don't fully understand the first issue, do you have an input file with the same name as one of the samples in the file?
Why do you need to have suffixes?
This issue happens in our phaseimpute pipeline. In some cases the user gives a single VCF, which might be named as
The suffix is used to correctly name the file for MultiQC in our pipeline. It also helps to filter and recover the meta information when you transpose the channel after the split operation.
Okay, thanks for the clarification. Could you work with
This might work in my use case, I will try.
Unfortunately it doesn't work in my use case, as I need the sample names to be first, then a suffix, and the extension last...
    if [ -n "${suffix}" ]; then
        for file in ${prefix}/*; do
            # Extract the basename
            base_name=\$(basename "\$file")
            # Start with an empty extension
            extension=""
            # Capture the extension and strip it from the basename, if present
            if [[ "\$base_name" =~ \\.(vcf|bcf)(\\.gz)?(\\.tbi|\\.csi)?\$ ]]; then
                extension="\${BASH_REMATCH[0]}"
                base_name="\${base_name%\$extension}"
            fi
            # Construct the new name
            new_name="\${base_name}${suffix}\${extension}"
            mv "\$file" "${prefix}/\$new_name"
        done
    fi
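For readers who want to try the renaming logic outside a Nextflow script block, here is a minimal standalone bash sketch of the same idea, without the Nextflow escaping. The function name `add_suffix`, the `_split` suffix and the sample file names are hypothetical and chosen only for illustration; it assumes bash (for `[[ =~ ]]` and `BASH_REMATCH`).

```shell
#!/usr/bin/env bash
# Sketch only: insert a suffix between the sample name and the VCF/BCF extension.
# add_suffix is a hypothetical helper, not part of the nf-core module.
add_suffix() {
    local dir="$1" suffix="$2" file base_name extension
    # Nothing to do when no suffix is requested
    [ -n "$suffix" ] || return 0
    for file in "$dir"/*; do
        [ -f "$file" ] || continue
        base_name=$(basename "$file")
        extension=""
        # Match .vcf/.bcf, optionally followed by .gz and an index extension
        if [[ "$base_name" =~ \.(vcf|bcf)(\.gz)?(\.tbi|\.csi)?$ ]]; then
            extension="${BASH_REMATCH[0]}"
            base_name="${base_name%"$extension"}"
        fi
        mv "$file" "$dir/${base_name}${suffix}${extension}"
    done
}

# Demo on empty placeholder files in a temporary directory
demo_dir=$(mktemp -d)
touch "$demo_dir/NA12878.vcf.gz" "$demo_dir/NA12878.vcf.gz.tbi" "$demo_dir/NA19401.bcf"
add_suffix "$demo_dir" "_split"
ls "$demo_dir"
```

The glob is expanded once before the loop starts, so files that have already been renamed are not visited a second time.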
This seems a lot more complicated than
Would it not be cleaner to adapt the logic in phaseimpute? Perhaps by passing the suffix in the meta? Or by adding a rename module, e.g. https://github.com/nf-core/raredisease/blob/master/modules/local/rename_align_files.nf?
Renaming the file would add a new step, but it would be cleaner for the nf-core repository.
@fellen31 thanks a lot for the help.
Nice, while renaming would be an extra step, I think perhaps it's worth it to keep this module a bit simpler.
Co-authored-by: Felix Lenner <[email protected]>
@fellen31 is it better now?
I don't understand. You are now making the files both inside an output directory and you are renaming them with a prefix?
To summarise, the module takes one vcf file and makes multiple, but sometimes the output file might have the same name as the input? At which point, why don't you stage the input file into a folder, then make all the output files in the work directory?
No, they were staying in the outputDir.
This is an excellent suggestion! I didn't think of it...
Do you still need to have a prefix/suffix in the output files in this module? The prefix is usually
I don't think so, if we keep the input and output in separate folders.
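The staging idea discussed above can be illustrated with a small bash sketch. It is only a simulation under stated assumptions: the folder name `input`, the sample name `NA12878` and the placeholder file contents are hypothetical, and the loop stands in for the real split step (with bcftools installed, something along the lines of `bcftools +split input/NA12878.vcf.gz -Oz -o .`, which is not run here).

```shell
#!/usr/bin/env bash
set -euo pipefail

# Work in a throwaway directory that plays the role of the task work directory
work=$(mktemp -d)
cd "$work"

# Stage the input in its own folder (hypothetical name "input");
# a text placeholder stands in for a real VCF
mkdir input
printf 'original input\n' > input/NA12878.vcf.gz

# Stand-in for the split step: one output file per sample, written to the
# work directory itself rather than next to the input
for sample in NA12878; do
    printf 'split output for %s\n' "$sample" > "${sample}.vcf.gz"
done

# Both files share a name but live in different folders, so neither is overwritten
cat input/NA12878.vcf.gz
cat NA12878.vcf.gz
```

Because the input and the outputs never share a directory, a single-sample file whose output name equals the input name no longer causes a clash, and no prefix or suffix is needed to tell them apart.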
Ok, looks good to me then!
PR checklist
Here is a PR to update the BCFTOOLS_PLUGINSPLIT process. The issues solved here were:
- The process didn't work when the file contained a single sample whose output file name matched the input file name -> solved by leaving the files in the $prefix folder
- No possibility to add a suffix to the generated output files -> ext.suffix argument added
This comment contains a description of changes (with reason).
If you've fixed a bug or added code that should be tested, add tests!
If you've added a new tool - have you followed the module conventions in the contribution docs
If necessary, include test data in your PR.
Remove all TODO statements.
Emit the versions.yml file.
Follow the naming conventions.
Follow the parameters requirements.
Follow the input/output options guidelines.
Add a resource label
Use BioConda and BioContainers if possible to fulfil software requirements.
Ensure that the test works with either Docker / Singularity. Conda CI tests can be quite flaky:
nf-core modules test <MODULE> --profile docker
nf-core modules test <MODULE> --profile singularity
nf-core modules test <MODULE> --profile conda
nf-core subworkflows test <SUBWORKFLOW> --profile docker
nf-core subworkflows test <SUBWORKFLOW> --profile singularity
nf-core subworkflows test <SUBWORKFLOW> --profile conda