Initial commit of revoluzer tools #5878

Merged
merged 7 commits into from
Mar 19, 2024
1 change: 1 addition & 0 deletions .github/CODEOWNERS
@@ -24,6 +24,7 @@
/tools/progressivemauve/ @hexylena
/tools/prokka/ @nsoranzo
/tools/raxml/ @nsoranzo
/tools/revoluzer/ @bernt-matthias
/tools/scater/ @nsoranzo
/tools/seurat/ @nsoranzo
/tools/sickle/ @nsoranzo
18 changes: 18 additions & 0 deletions tools/revoluzer/.shed.yml
@@ -0,0 +1,18 @@
name: revoluzer
owner: iuc
description: revoluzer wrappers
long_description: |
  Gene order analysis tools
categories:
- Phylogenetics
remote_repository_url: https://github.com/galaxyproject/tools-iuc/tree/master/tools/revoluzer
homepage_url: https://gitlab.com/Bernt/revoluzer/
type: unrestricted
auto_tool_repositories:
  name_template: "{{ tool_id }}"
  description_template: "Wrapper for revoluzer tool: {{ tool_name }}."
suite:
  name: "suite_revoluzer"
  description: "A suite of tools that brings the revoluzer project into Galaxy."
  long_description: |
    Gene order analysis tools
96 changes: 96 additions & 0 deletions tools/revoluzer/crex.xml
@@ -0,0 +1,96 @@
<tool id="revoluzer_crex" name="CREx" version="@TOOL_VERSION@+galaxy@VERSION_SUFFIX@" profile="20.01" license="MIT">
<description>reconstruct pairwise gene order rearrangement</description>
<macros>
<import>macros.xml</import>
</macros>
<expand macro="biotools"/>
<expand macro="requirements"/>
<version_command>crex --version</version_command>
<command detect_errors="exit_code"><![CDATA[
crex
-f '$f'
$linear
$method_cond.method_select
#if $method_cond.method_select == ""
$method_cond.noalt
--wI $method_cond.wI
--wiT $method_cond.wiT
--wTDRL $method_cond.wTDRL
#else if $method_cond.method_select == "--crex1"
$method_cond.prinobp
#end if
> '$out'
]]></command>
<inputs>
<param argument="-f" type="data" format="fasta" label="Gene orders"/>
<param argument="--linear" type="boolean" truevalue="--linear" falsevalue="" checked="false" label="Genomes are linear"/>
<conditional name="method_cond">
<param name="method_select" type="select" label="method">
<option value="">CREx2</option>
<option value="--crex1" selected="true">CREx1</option>
<option value="--bp">compute with breakpoint scenario [ZhaoBourque07]</option>
</param>
<when value="">
<param argument="--wI" type="float" min="0" value="1" label="Weight of an inversion"/>
<param argument="--wiT" type="float" min="0" value="1" label="Weight of an inverse transposition"/>
<param argument="--wTDRL" type="float" min="0" value="1" label="Weight of a TDRL"/>
<param argument="--noalt" type="boolean" truevalue="" falsevalue="--noalt" checked="true" label="Compute alternatives for T+iT"/>
</when>
<when value="--crex1">
<param argument="--prinobp" type="boolean" truevalue="" falsevalue="--prinobp" checked="true" label="Compute breakpoint scenario for prime nodes"/>
</when>
<when value="--bp"></when>
</conditional>
</inputs>
<outputs>
<data name="out" format="tabular">
<actions>
<action name="column_names" type="metadata" default="Source gene order,Target Gene order,Rearrangement,Breakpoints" />
</actions>
</data>
</outputs>
<tests>
<test expect_num_outputs="1">
<param name="f" value="test.fas"/>
<param name="linear" value="true"/>
<output name="out">
<metadata name="column_names" value="Source gene order,Target Gene order,Rearrangement,Breakpoints" />
<assert_contents>
<has_n_lines n="7"/>
<has_n_columns n="4"/>
<has_text text="I(B C )"/>
</assert_contents>
</output>
</test>
</tests>
<help><![CDATA[

.. class:: infomark

**What it does**

Compute rearrangement scenarios for pairs of gene orders (with equal, duplication-free gene content).

Usage
.....

**Input**

@INPUT_FORMAT@

**Output**

Rearrangements in the 3rd column of the output table are listed as follows:

- I(X): the genes listed in X are inverted
- T(X ,Y ,): the order of the gene sets X and Y is transposed
- iT(X, Y, ): same as a transposition, but one of the gene sets is also inverted
- TDRL(X, Y): a tandem duplication random loss where the genes in X are kept in the 1st copy and the genes in Y in the 2nd copy

]]></help>
<citations>
<citation type="doi">10.1109/TCBB.2018.2831661</citation>
<citation type="doi">10.1093/bioinformatics/btm468</citation>
<citation type="doi">10.1007/978-3-540-74960-8_12</citation>
</citations>
</tool>
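To make the `I(X)` notation in the help text concrete, here is a minimal Python sketch of an inversion on a signed gene order. It is only an illustration and not how CREx itself is implemented; the `invert` helper and its index arguments are hypothetical names introduced here.

```python
from typing import List


def flip(gene: str) -> str:
    """Toggle the strand marker: a leading minus means the other strand."""
    return gene[1:] if gene.startswith("-") else "-" + gene


def invert(order: List[str], start: int, end: int) -> List[str]:
    """Invert order[start:end]: reverse the segment and flip every gene in it.

    Hypothetical helper for illustration; indices follow Python slicing.
    """
    segment = [flip(gene) for gene in reversed(order[start:end])]
    return order[:start] + segment + order[end:]


genome1 = "A B C D".split()
# Inverting the block B C of genome1 yields the gene order of genome3
# from test-data/test.fas.
print(" ".join(invert(genome1, 1, 3)))  # prints "A -C -B D"
```

Applying the same inversion twice restores the original order, which is a quick sanity check for any such helper.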
119 changes: 119 additions & 0 deletions tools/revoluzer/distmat.xml
@@ -0,0 +1,119 @@
<tool id="revoluzer_distmat" name="Compute distance matrix" version="@TOOL_VERSION@+galaxy@VERSION_SUFFIX@" profile="20.01" license="MIT">
<description>for gene orders</description>
<macros>
<import>macros.xml</import>
</macros>
<expand macro="biotools"/>
<expand macro="requirements"/>
<version_command>distmat --version</version_command>
<command detect_errors="exit_code"><![CDATA[
distmat
-f '$f'
$structure
$sign
$distance
$duplicates
$output_cond.output_sel
#if $output_cond.output_sel == ""
$output_cond.header
#end if
> '$out'
]]></command>
<inputs>
<param argument="-f" type="data" format="fasta" label="Gene orders"/>
<param name="structure" type="select" label="Genome structure">
<option value="">Circular</option>
<option value="--lindir">Linear directed genomes (--lindir)</option>
<option value="--linund">Linear undirected genomes (--linund)</option>
</param>
<param argument="--sign" type="boolean" truevalue="--sign" falsevalue="" label="Genomes are circular"/>
<param name="distance" type="select" label="Distance" help="Note that the default on the old web site was to compute Breakpoint distances">
<option value="--crex">CREx</option>
<option value="">Inversion</option>
<option value="-b">Breakpoint</option>
<option value="-i -m">Common Intervals</option>
<option value="-i -m --lw">Length weighted common intervals</option>
</param>
<param name="duplicates" type="select" label="Remove duplicate gene orders">
<option value="">No</option>
<option value="-d">Yes</option>
<option value="-D">Yes and print names of removed gene orders</option>
</param>
<conditional name="output_cond">
<param name="output_sel" type="select" label="Output type">
<option value="">Table</option>
<option value="--nexus">Nexus</option>
<option value="--list">List</option>
</param>
<when value="">
<param argument="--header" type="boolean" truevalue="--header" falsevalue="" label="Include header in table"/>
</when>
<when value="--nexus"/>
<when value="--list"/>
</conditional>
</inputs>
<outputs>
<data name="out" format="tabular"/>
</outputs>
<tests>
<test expect_num_outputs="1">
<param name="f" value="test.fas"/>
<output name="out">
<assert_contents>
<has_n_lines n="5"/>
<has_n_columns n="1"/> <!-- without a header, the 1st line is just the number of genomes -->
</assert_contents>
</output>
</test>
<test expect_num_outputs="1">
<param name="f" value="test.fas"/>
<param name="distance" value="-b"/>
<conditional name="output_cond">
<param name="header" value="true"/>
</conditional>
<output name="out">
<assert_contents>
<has_n_lines n="5"/>
<has_n_columns n="5"/>
</assert_contents>
</output>
</test>
</tests>
<help><![CDATA[

.. class:: infomark

**What it does**

Compute a distance matrix for gene orders of unichromosomal genomes with equal, duplication-free gene content,
e.g. mitochondrial gene orders. Several distance measures are available:

- CREx distance (Bernt et al. 2007); all other distance measures have been implemented in this software package.
- Inversion distance (Bergeron, Heber & Stoye 2002)
- Number of breakpoints (e.g. Sankoff & Blanchette 1997)
- Common intervals (Bergeron, Chauve, de Montgolfier & Raffinot 2008)
- Conserved intervals (Bergeron, Blanchette, Chateau & Chauve 2004)

For the interval-based measures, a distance is computed by subtracting the number of intervals from the maximum possible for the given number of genes.

Usage
.....

**Input**

@INPUT_FORMAT@

**Output**

A tabular file showing the distance matrix.

]]></help>
<citations>
<citation type="doi">10.1093/bioinformatics/btm468</citation>
<citation type="doi">10.1089/cmb.1998.5.555</citation><!--Sankoff Blanchette-->
<citation type="doi">10.1093/bioinformatics/18.suppl_2.s54</citation>
<citation type="doi">10.1137/060651331</citation>
<citation type="doi">10.1007/978-3-540-30219-3_2</citation>
</citations>
</tool>
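For readers unfamiliar with the breakpoint measure named in the help text, the following Python sketch counts breakpoints between two signed gene orders. It assumes one common textbook definition for linear genomes without frame elements; `distmat` may count differently (for example for circular genomes), so treat the function names and the definition used here as assumptions rather than a description of revoluzer.

```python
from typing import List, Set, Tuple


def flip(gene: str) -> str:
    """Toggle the strand marker: a leading minus means the other strand."""
    return gene[1:] if gene.startswith("-") else "-" + gene


def adjacencies(order: List[str]) -> Set[Tuple[str, str]]:
    """Adjacent gene pairs of a linear order, in both reading directions."""
    pairs = set()
    for left, right in zip(order, order[1:]):
        pairs.add((left, right))
        # Read from the other strand, the same adjacency appears reversed
        # with both signs flipped.
        pairs.add((flip(right), flip(left)))
    return pairs


def breakpoints(source: List[str], target: List[str]) -> int:
    """Count adjacencies of source that do not occur in target."""
    target_adjacencies = adjacencies(target)
    return sum(
        1 for pair in zip(source, source[1:]) if pair not in target_adjacencies
    )


genome1 = "A B C D".split()
genome3 = "A -C -B D".split()
# Only the adjacency B C survives (as -C -B), so A|B and C|D are breakpoints.
print(breakpoints(genome1, genome3))  # prints 2
```

Under this definition an order compared with itself has zero breakpoints, and the count is symmetric for orders over the same gene set.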
18 changes: 18 additions & 0 deletions tools/revoluzer/macros.xml
@@ -0,0 +1,18 @@
<macros>
<token name="@TOOL_VERSION@">0.1.6</token>
<token name="@VERSION_SUFFIX@">0</token>
<xml name="biotools">
<xrefs>
<xref type="bio.tools">revoluzer</xref>
</xrefs>
</xml>
<xml name="requirements">
<requirements>
<requirement type="package" version="@TOOL_VERSION@">revoluzer</requirement>
</requirements>
</xml>
<token name="@INPUT_FORMAT@"><![CDATA[
Input is a gene order FASTA file. Instead of a sequence, each record contains a space-separated list of gene names;
a name may be prefixed with a minus sign to mark a gene that is on the other strand.
]]></token>
</macros>
8 changes: 8 additions & 0 deletions tools/revoluzer/test-data/test.fas
@@ -0,0 +1,8 @@
> genome1
A B C D
> genome2
A C B D
> genome3
A -C -B D
> genome4
A -C B D
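The gene order FASTA format used by the test data can be read with a few lines of Python. This is a minimal sketch for illustration, assuming the layout described in the `@INPUT_FORMAT@` token (a `>` header followed by space-separated, optionally minus-prefixed gene names); `parse_gene_orders` is a hypothetical helper and not part of revoluzer.

```python
from typing import Dict, List


def parse_gene_orders(text: str) -> Dict[str, List[str]]:
    """Map each genome name to its list of signed gene names."""
    orders: Dict[str, List[str]] = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # Header line: everything after '>' is the genome name.
            name = line[1:].strip()
            orders[name] = []
        elif name is not None:
            # Gene order line: whitespace-separated gene names.
            orders[name].extend(line.split())
    return orders


example = """> genome1
A B C D
> genome3
A -C -B D
"""
orders = parse_gene_orders(example)
print(orders["genome3"])  # prints ['A', '-C', '-B', 'D']
```

Keeping the minus sign as part of the gene string keeps the sketch short; a stricter parser might split each entry into a name and a strand flag instead.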