Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update shovill tool to accept fastq and fastq.gz #5476

Merged
merged 2 commits into from
Oct 12, 2023

Conversation

pimarin
Copy link
Contributor

@pimarin pimarin commented Sep 11, 2023

Shovill seem to failed when input is indicated as fast or fastq.gz while it can work with this input.
I just add the fastq as input format

@bernt-matthias
Copy link
Contributor

Why do you use fastq as format for your data and not fastqsanger or fastqillumina?

Fastq is essentially an unspecified fastq which should be used if the underlying tool can work with both phred encodings. Is this the case here?

@pimarin
Copy link
Contributor Author

pimarin commented Sep 12, 2023

The tool failed when we submitted classical fastq or fastq.gz illumina reads (while shovill command line accept this input type). If I force to change data input as fastqsanger it's okay but not good
In the wrapper only fastqsanger,fastqsanger.gz,fastqsanger.bz2 were provided, while in other assembler (for example spades) or tools which analyse fastq files, fastq and fastq.gz are also indicated.

@bernt-matthias
Copy link
Contributor

Then we should explicitly add fastqillumina and the zipped variants.

fastq would also allow fastqcssanger which is probably not desired.

@bernt-matthias
Copy link
Contributor

But I also have to add that the fact that a tool accepts both types may not be the same as that the tool correctly works with both.

Most of the time tools accepting both types either

  • ignore qualities anyway
  • have a flag to set the phred score
  • or detect internally (which is seldom I think)

@pimarin
Copy link
Contributor Author

pimarin commented Sep 12, 2023

Humm you right, Shovill explicitaly work with Illumina paired-end readsas the author said. So it's better to remove the fastqsanger and fastq like option and only add fastqillumina ? but I need to force the datatype when a fastq file is used (even if it's fastq from illumina)

@bernt-matthias
Copy link
Contributor

Humm you right,

Ironically I can't agree with this statement. If you check their test data they have fastq files with all quality values being 2 which is not possible in fastqillumina, but it is in fastqsanger. So they likely can deal with fastqsanger.

Maybe it's best to just open an issue at their repo and ask if shovill can deal with fastq data of any quality encoding.

@bgruening
Copy link
Member

They can for sure deal with fastqsanger.

@pimarin
Copy link
Contributor Author

pimarin commented Sep 25, 2023

I don't have any answer from my issue on shovill repo, so it's better to keep only fastqsanger as previously write on the wrapped. So my PR is not necessary.

@bgruening bgruening merged commit 0c2ed11 into galaxyproject:main Oct 12, 2023
10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants