update proteinortho v6.2.3 and some changes to the xml #5184

Merged 27 commits on Jun 16, 2023

11 changes: 5 additions & 6 deletions tools/proteinortho/proteinortho.xml
@@ -100,11 +100,11 @@
<option value="blatp">BLAT (aminoacid sequences)</option>
<option value="blatn">BLAT (nucleotide sequences)</option>
</param>
<param argument="--evalue" type="float" value="0.001" min="0" label="E-value threshold of the blast algorithm" help="This is the main parameter for the generation of the reciprocal best hit graph. Larger values results in more false positives (connections between proteins)."/>
<param argument="--conn" type="float" value="0.1" min="0." max="10." label="Minimal algebraic connectivity" help="This is the main parameter for the clustering step. Choose larger values then more splits are done, resulting in more and smaller clusters."/>
<param argument="--sim" type="integer" value="95" min="0" max="100" label="Minimal reciprocal similarity in %" help="This and --evalue are main parameters for the generation of the reciprocal best hit graph. 1 = only the best reciprocal hits are reported, 0 = all possible reciprocal blast matches (within the E-value cutoff) are reported."/>
<param argument="--conn" type="float" value="0.1" min="0." max="1." label="Minimal algebraic connectivity" help="This is the main parameter for the clustering step. Choose larger values then more splits are done, resulting in more and smaller clusters. A value of 0 corresponds to no clustering."/>
<section name="more_options" title="Additional Options" expanded="False">
<param argument="--evalue" type="float" value="0.001" min="0" label="E-value threshold of the blast algorithm" help="Larger values results in more false positives (connections between proteins)."/>
Contributor

For consistency I would use $more_options.evalue in the command block (even if $evalue should work as well).
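
As a minimal sketch of what this suggestion could look like, the excerpt below references the parameter through its enclosing section. The flag spellings come from the `argument` attributes in the diff; the exact way the wrapper passes them to proteinortho, and the rest of the command line, are assumptions made for illustration.

```xml
<!-- Hypothetical excerpt: only the variable references are the point here. -->
<command><![CDATA[
    proteinortho
        ## conn stays at the top level, so it is referenced directly
        --conn '$conn'
        ## evalue is nested in <section name="more_options">, so it is
        ## referenced through the section name
        --evalue '$more_options.evalue'
]]></command>
```

Using the section-qualified name keeps the command block aligned with the layout of the `<inputs>` block, which makes it easier to see where each value comes from.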

Contributor Author

Ahh ok, so since these options are nested in the more_options section, the variable in the command block needs the section prefix too. I will add this to the XML.

Contributor Author

@pmjklemm, Mar 11, 2023

Do the tests also need this? E.g., does <param name="evalue" value="1"/> need to become <param name="more_options.evalue" value="1"/>?
I just saw that the diamond wrapper nests such params in the same section tags inside its tests, so that is probably the solution here.
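
A sketch of that approach, assuming the test reuses the input files and values shown elsewhere in this diff: the nested parameter is wrapped in a `<section>` element whose name matches the one in the `<inputs>` block, and the parameter keeps its plain name.

```xml
<!-- Hypothetical test excerpt; values are taken from the existing test in this diff. -->
<test expect_num_outputs="3">
    <param name="input_files" value="L.fasta,C.fasta,C2.fasta,E.fasta,M.fasta"/>
    <!-- evalue lives in the more_options section, so the test nests it the same way -->
    <section name="more_options">
        <param name="evalue" value="1"/>
    </section>
    <!-- conn is a top-level parameter and needs no wrapper -->
    <param name="conn" value="1"/>
</test>
```

Mirroring the section structure in the test avoids the dotted name in the `name` attribute and matches how the parameter is declared.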

<param argument="--cov" type="integer" value="50" min="0" max="100" label="Minimal coverage of best blast alignments in %"/>
<param argument="--sim" type="integer" value="95" min="0" max="100" label="Minimal sequence similarity in %"/>
<param argument="--identity" type="integer" value="25" min="0" max="100" label="Minimal percent identity of best blast hits in %"/>
<param argument="--selfblast" type="boolean" checked="false" truevalue="--selfblast" falsevalue="" label="Apply selfblast, detects paralogs without orthologs "/>
<param argument="--singles" type="boolean" checked="false" truevalue="--singles" falsevalue="" label="Report singleton genes without any hit "/>
@@ -124,7 +124,7 @@
<when value="specified">
<param argument="--dups" type="integer" value="0" min="0" max="100" label="Number of reiterations for adjacencies heuristic, to determine duplicated regions"/>
<param argument="--cs" type="integer" value="3" min="0" max="100" label="Size of a maximum common substring (MCS) for adjacency matches"/>
<param argument="--alpha" type="float" value="0.5" min="0." max="1." label="Minimal percent identity of best blast hits"/>
<param argument="--alpha" type="float" value="0.5" min="0." max="1." label="Weight of adjacencies vs. sequence similarity" help="alpha[FF-adj score] + (1−alpha)[BLAST score]"/>
<param name="input_files_syn" type="data" format="gff" multiple="true" min="2" label="Select the GFF3 files matching the input fasta files" help="The GFF3 files need matching names with the input fasta files. If you provide mybacteria123.faa or mybacteria123.fasta ... then you need to provide mybacteria123.gff here accoringly. The attributes column (#9) must contain the attribute Name=GENE IDENTIFIER where GENE IDENTIFIER corresponds to the respective (protein) identifier in the FASTA input. For example see https://gitlab.com/paulklemm_PHD/proteinortho/-/blob/master/test/C.gff"/>
</when>
</conditional>
@@ -144,7 +144,6 @@
</test>
<test expect_num_outputs="3"> <!-- various parameter -->
<param name="input_files" value="L.fasta,C.fasta,C2.fasta,E.fasta,M.fasta"/>
<param name="evalue" value="1"/>
<param name="conn" value="1"/>
<param name="cov" value="42"/>
<param name="sim" value="42"/>
@@ -279,5 +278,5 @@ Proteinortho is a tool to detect orthologous proteins/genes within different spe
More information can be found on github https://gitlab.com/paulklemm_PHD/proteinortho
]]>
</help>
<expand macro="citations"/>
<expand macro="citations" />
</tool>
6 changes: 3 additions & 3 deletions tools/proteinortho/proteinortho_macros.xml
@@ -1,7 +1,7 @@
<?xml version="1.0"?>
<macros>
<token name="@TOOL_VERSION@">6.1.5</token>
<token name="@WRAPPER_VERSION@">0</token>
<token name="@TOOL_VERSION@">6.2.0</token>
<token name="@WRAPPER_VERSION@">1</token>
<token name="@PROFILE@">20.09</token>
<xml name="citations">
<citations>
@@ -17,7 +17,7 @@
<requirement type="package" version="2.0.15">diamond</requirement>
<requirement type="package" version="2.13.0">blast</requirement>
<requirement type="package" version="377">ucsc-blat</requirement>
<requirement type="package" version="1418">last</requirement>
<requirement type="package" version="1422">last</requirement>
</requirements>
</xml>
<xml name="version_command">