add ChewBBACA #5899

Merged
merged 100 commits into from
Apr 13, 2024

Conversation

nilchia
Contributor

@nilchia nilchia commented Mar 21, 2024

FOR CONTRIBUTOR:

  • I have read the CONTRIBUTING.md document and this tool is appropriate for the tools-iuc repo.
  • License permits unrestricted use (educational + commercial)
  • This PR adds a new tool or tool collection
  • This PR updates an existing tool or tool collection
  • This PR does something else (explain below)

@nilchia nilchia marked this pull request as draft March 21, 2024 14:41
mkdir ./input &&
mkdir ./schema &&
#for $file in $input_file
cp $file ./input/${file.element_identifier} &&
Member

Please single-quote all file paths.

<command detect_errors="exit_code"><![CDATA[
mkdir ./input &&
#for $file in $input_file
cp $file ./input/${file.element_identifier} &&
Member

Does this work instead of copying?

Suggested change
- cp $file ./input/${file.element_identifier} &&
+ ln -sf '$file' './input/${file.element_identifier}' &&
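
A minimal sketch of how the staging loop could look with quoted paths and symlinks, as the two review comments suggest. The `#end for` line and the trailing `ls` command are assumptions added only to make the fragment complete; the actual chewBBACA invocation is not shown in this thread.

```xml
<command detect_errors="exit_code"><![CDATA[
## Stage every input under its display name. Single quotes guard against
## spaces and shell metacharacters in either path.
mkdir ./input &&
#for $file in $input_file
    ln -sf '$file' './input/${file.element_identifier}' &&
#end for
## Placeholder (assumption): the real tool call would follow here.
ls ./input
]]></command>
```

Symlinking avoids duplicating large FASTA files in the job directory; copying is only needed if the tool modifies its inputs in place.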

</test>
</tests>
<help><![CDATA[
chewBBACA version: 3.3.3
Member

Please create a useful help section. The parameters are already explained in the form, so here you can describe the tool, its functionality, and maybe where the zip file can be obtained.
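
One possible shape for such a help section, written in reStructuredText as Galaxy help blocks usually are. The headings and placeholder bullets below are assumptions, not text from the merged tool; the schema source in particular has to be filled in by the tool author.

```xml
<help><![CDATA[
**What it does**

chewBBACA is a software suite for creating and evaluating core-genome and
whole-genome MLST (cg/wgMLST) schemas and results.

.. Assumption - generic one-line summary, replace with the module description

**Input**

- A schema as a zip archive. (Placeholder, state where users can obtain it.)

**Output**

- (Placeholder, list each output dataset and what it contains.)
]]></help>
```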

<expand macro="requirements" />
<command detect_errors="exit_code"><![CDATA[
mkdir ./schema &&
unzip $input_schema -d ./schema &&
Member

Suggested change
- unzip $input_schema -d ./schema &&
+ unzip '$input_schema' -d ./schema &&

</xml>
<xml name="citations">
<citations>
<citation type="bibtex">
Member

Why not cite the real publication here? DOI: 10.1099/mgen.0.000166
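
For illustration, the citations macro could reference the DOI directly instead of carrying a BibTeX entry. This is a sketch of the suggestion, not necessarily the exact markup that was merged.

```xml
<xml name="citations">
    <citations>
        <!-- Galaxy resolves the DOI to a full reference, so no BibTeX body is needed -->
        <citation type="doi">10.1099/mgen.0.000166</citation>
    </citations>
</xml>
```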

tools/chewbbaca/CreateSchema.xml (outdated)
<test expect_num_outputs="1">
<param name="input_file" value="CDS_Str_agalactiae.fasta"/>
<param name="cds_input" value="true"/>
<output name="schema" file="Str_agalactiae_cds.zip" compare="sim_size"/>
Member

Instead of comparing with sim_size, you can check for members in the archive: https://docs.galaxyproject.org/en/latest/dev/schema.html#has-archive-member

Contributor Author

Thanks! Yes that makes more sense. I have added it.
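
A sketch of what the archive-member check might look like for this test. The member patterns are assumptions based on the schema layout described elsewhere in the PR (locus FASTA files plus a folder named "short"); the real member names may differ.

```xml
<test expect_num_outputs="1">
    <param name="input_file" value="CDS_Str_agalactiae.fasta"/>
    <param name="cds_input" value="true"/>
    <output name="schema">
        <assert_contents>
            <!-- path is a regular expression matched against archive member names -->
            <has_archive_member path=".*\.fasta"/>
            <!-- Assumption: representative alleles live in a "short" subfolder -->
            <has_archive_member path=".*short/.*\.fasta"/>
        </assert_contents>
    </output>
</test>
```

Unlike `sim_size`, this fails if the archive is the right size but has the wrong contents.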

<assert_stdout>
<has_text text="Schema is now available at"/>
<has_text text="Finished at"/>
</assert_stdout>
Member

Same here: you can test the zip file content.

<param argument="--common" type="boolean" truevalue="--common" falsevalue="" checked="false" label="Common" optional="true" help="Create file with profiles for the set of common loci" />
</inputs>
<outputs>
<data format="tsv" name="JoinedProfile" from_work_dir="JoinedProfile.tsv" label="${tool.name} on ${on_string}: Joined profiles"/>
Member

Can you use format="tabular" here?

Contributor Author

Yes, I updated it. But isn't it the same?

<tests>
<test>
<param name="input1" value="results_alleles.tsv,results_alleles2.tsv"/>
<output name="JoinedProfile" file="JoinedProfile.tsv" compare="sim_size"/>
Member

Why sim_size here and not comparing by diff? Or using asserts?

Contributor Author

@nilchia nilchia Apr 11, 2024

Updated it with compare="diff"
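
The two alternatives the reviewer mentions could be sketched as follows. The asserted header text and line count in the second test are hypothetical values chosen for illustration, not taken from the actual test data.

```xml
<test>
    <param name="input1" value="results_alleles.tsv,results_alleles2.tsv"/>
    <!-- Option 1: line-by-line comparison against the expected file -->
    <output name="JoinedProfile" file="JoinedProfile.tsv" compare="diff"/>
</test>
<test>
    <param name="input1" value="results_alleles.tsv,results_alleles2.tsv"/>
    <!-- Option 2: content assertions (hypothetical values) -->
    <output name="JoinedProfile">
        <assert_contents>
            <has_text text="FILE"/>
            <has_n_lines n="3"/>
        </assert_contents>
    </output>
</test>
```

A diff comparison is the stricter choice when the output is deterministic; assertions fit better when parts of the output vary between runs.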

<tests>
<test>
<param name="mode" value="species" />
<output name="NSStats" file="NSStats.txt" compare="sim_size" />
Member

sim_size is not a good test. It's the least stringent test that we have, and we should do better whenever we can. Please use diff or asserts.

Contributor Author

Updated in every tool that used sim_size as a test.

<inputs>
<param format="zip" name="input_schema" type="data" multiple="true" label="Schema Files in zip format" help="The schema directory contains the loci FASTA files and a folder named 'short' that contains the FASTA files with the loci representative alleles."/>
<section name="advanced" title="Advanced options">
<param argument="--training-file" type="data" format="binary" label="Prodigal training file" optional="true" />
Member

format="binary" is strange. What is this file?

<test expect_num_outputs="1">
<param name="input_schema" value="GCA_000007265.1_ASM726v1_schema_seed.zip"/>
<param name="size_filter" value="false"/>
<output name="schema" file="GCA_000007265.1_ASM726v1_PExternalschema_seed.zip" compare="sim_size"/>
Member

@nilchia
Contributor Author

nilchia commented Apr 11, 2024

@bgruening the binary file has a ".trn" extension (it is called a Prodigal training file). I tried to open it in VSCode and it says: The file is not displayed in the text editor because it is either binary or uses an unsupported text encoding.

I googled the file and found this:
Prodigal (PROkaryotic DynamIc programming Genefinding ALgorithm) is an open source, lightweight microbial gene-finding program developed at the University of Tennessee and Oak Ridge National Laboratory.

The Prodigal package provides three output formats: GFF3, GenBank, and Sequin table format.
But in chewBBACA it is a .trn file.

@@ -8,7 +8,7 @@ Contacts: [email protected]
=======================
chewBBACA - NSStats
=======================
-Started at: 2024-04-03T10:24:19
+Started at: 2024-04-11T17:01:56
Member

Updating the time will not help much. If you have timestamps in your diffs, you need to allow for a certain number of differing lines with lines_diff:

<output name="outvcf" file="call-out1.vcf" lines_diff="6" />

Contributor Author

Thanks.
I added lines_diff="4" and it passed the planemo test locally, but it failed here, so I switched the test to assert_contents.
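
Both approaches discussed here can be sketched for the NSStats test. This assumes the diff hunk above comes from the expected NSStats.txt and that only the "Started at" and "Finished at" lines vary between runs; neither is confirmed in the thread, which also reports that the `lines_diff` variant failed in CI.

```xml
<test>
    <param name="mode" value="species"/>
    <!-- Option 1: tolerate volatile lines. Each modified line counts as two
         (one removed, one added), so two timestamp lines need lines_diff="4". -->
    <output name="NSStats" file="NSStats.txt" lines_diff="4"/>
</test>
<test>
    <param name="mode" value="species"/>
    <!-- Option 2: assert on stable text and on the shape of the timestamp line -->
    <output name="NSStats">
        <assert_contents>
            <has_text text="chewBBACA - NSStats"/>
            <has_line_matching expression="Started at: \d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}"/>
        </assert_contents>
    </output>
</test>
```

The assertion variant does not depend on how many lines change, which makes it more robust when other parts of the output (paths, versions) also differ between environments.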

@bgruening bgruening merged commit 8722918 into galaxyproject:main Apr 13, 2024
14 checks passed
@bgruening
Member

Thanks, @nilchia, for your first contribution! Great work - it was a complicated tool!

@nilchia
Contributor Author

nilchia commented Apr 13, 2024

Hooray 😃 ,
Thank you so much, @bgruening and @pavanvidem for your invaluable help and guidance. I've learned a lot and I couldn't have done this without you both! 🌷

@nilchia nilchia deleted the chewbbaca branch April 13, 2024 22:34

3 participants