Chaining proteinfold #176

Merged
merged 39 commits into from
Nov 27, 2024
Changes from 27 commits
29b2a4a
add matching ids
luisas Nov 18, 2024
ef90b74
clean
luisas Nov 18, 2024
8871516
update
luisas Nov 18, 2024
c65ec4b
Update input
luisas Nov 22, 2024
66ee7b2
Fix docs
luisas Nov 22, 2024
a315e30
Update Readme
luisas Nov 22, 2024
c0e5f29
Add documentation chaining pipelines
luisas Nov 22, 2024
4d11ded
update docs
luisas Nov 25, 2024
c4e5c1a
update schema
luisas Nov 25, 2024
6dd7aa8
update schema
luisas Nov 25, 2024
daeaa65
fix some lintin
luisas Nov 25, 2024
757a111
update modules
luisas Nov 25, 2024
63ffeff
merge dev
luisas Nov 25, 2024
67bc726
fix lint
luisas Nov 25, 2024
a3e72a9
update metromap
luisas Nov 25, 2024
69a04d7
fix config
luisas Nov 25, 2024
6eff7df
fix conf
luisas Nov 25, 2024
a5272c6
up
luisas Nov 25, 2024
5f2e4d7
up
luisas Nov 25, 2024
91a7e63
up
luisas Nov 25, 2024
e266fc1
up
luisas Nov 25, 2024
010a447
update modules
luisas Nov 26, 2024
0c7fa46
update docs
luisas Nov 26, 2024
f8e62bd
fix lint
luisas Nov 26, 2024
3b6ff65
update
luisas Nov 26, 2024
46dcc9d
up
luisas Nov 26, 2024
6c348d4
up
luisas Nov 26, 2024
386d973
Update README.md
luisas Nov 27, 2024
e068acc
Update docs/usage/chaining_with_proteinfold.md
luisas Nov 27, 2024
9928909
Update docs/usage/chaining_with_proteinfold.md
luisas Nov 27, 2024
052321d
Update workflows/multiplesequencealign.nf
luisas Nov 27, 2024
121e75f
Update workflows/multiplesequencealign.nf
luisas Nov 27, 2024
21037be
Update docs/usage/chaining_with_proteinfold.md
luisas Nov 27, 2024
54a0030
Update workflows/multiplesequencealign.nf
luisas Nov 27, 2024
802741a
Update workflows/multiplesequencealign.nf
luisas Nov 27, 2024
54e2f50
Update workflows/multiplesequencealign.nf
luisas Nov 27, 2024
66bf019
up
luisas Nov 27, 2024
91026c6
update docs
luisas Nov 27, 2024
ccb04b0
Update docs/usage/chaining_with_proteinfold.md
luisas Nov 27, 2024
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -39,6 +39,7 @@ Initial release of nf-core/multiplesequencealign, created with the [nf-core](htt
- [[#147](https://github.com/nf-core/multiplesequencealign/pull/147)] - Add small testing profile + some fixes of the shiny app.
- [[#148](https://github.com/nf-core/multiplesequencealign/pull/148)] - Add UPP module.
- [[#150](https://github.com/nf-core/multiplesequencealign/pull/150)] - Update modules and readme for pre-release.
- [[#174](https://github.com/nf-core/multiplesequencealign/issues/174)] - Add the chaining of proteinfold output to MSA input.

### `Fixed`

15 changes: 6 additions & 9 deletions README.md
@@ -38,9 +38,8 @@ The pipeline performs the following steps:

## Usage

:::note
If you are new to Nextflow and nf-core, please refer to [this page](https://nf-co.re/docs/usage/installation) on how to set-up Nextflow. Make sure to [test your setup](https://nf-co.re/docs/usage/introduction#how-to-run-a-pipeline) with `-profile test` before running the workflow on actual data.
:::
> [!NOTE]
> If you are new to Nextflow and nf-core, please refer to [this page](https://nf-co.re/docs/usage/installation) on how to set-up Nextflow. Make sure to [test your setup](https://nf-co.re/docs/usage/introduction#how-to-run-a-pipeline) with `-profile test` before running the workflow on actual data.

#### 1. SAMPLESHEET

@@ -57,9 +56,8 @@ toxin,toxin.fa,toxin-ref.fa,toxin_structures,toxin_template.txt

Each row represents a set of sequences (in this case the seatoxin and toxin protein families) to be aligned and the associated (if available) reference alignments and dependency files (this can be anything from protein structure or any other information you would want to use in your favourite MSA tool).

:::note
The only required input is the id column and either fasta or dependencies.
:::
> [!NOTE]
> The only required input is the id column and either fasta or dependencies.

#### 2. TOOLSHEET

@@ -78,9 +76,8 @@ FAMSA, -gt upgma -medoidtree, FAMSA,
FAMSA,,REGRESSIVE,
```

:::note
The only required input is aligner.
:::
> [!NOTE]
> The only required input is aligner.

#### 3. RUN THE PIPELINE

Binary file modified docs/images/nf-core-msa_metro_map.png
42 changes: 42 additions & 0 deletions docs/usage/chaining_with_proteinfold.md
@@ -0,0 +1,42 @@
# Using nf-core/proteinfold to generate the input protein structures

Structural aligners leverage protein structural information to render the MSA.

You can provide your PDB structures via the samplesheet, as outlined in the primary usage documentation. However, if you do not already have protein structures available, you may opt to use protein structure prediction tools to create these models.

To facilitate this, we offer seamless integration with the nf-core/proteinfold pipeline, enabling you to generate the protein structures required for this workflow.

To do so, you only need to build one samplesheet file, in the exact format required by the nf-core/multiplesequencealign pipeline.
This samplesheet is also compatible with nf-core/proteinfold, which will predict the structures and output them in the format required by the nf-core/multiplesequencealign pipeline.

You can then run the two pipelines one after the other with the following commands.
Please refer to the [proteinfold documentation](https://nf-co.re/proteinfold/1.0.0/) to choose the parameters that suit your run.

Here we show how to run proteinfold in its local colabfold mode, but the chaining works for all proteinfold modes.

```bash
nextflow run nf-core/proteinfold --input ./samplesheet.csv \
--outdir ./proteinfold_results \
--split_fasta \
-r dev \
--mode colabfold \
--colabfold_server local \
--colabfold_db <null (default) | PATH> \
--num_recycle 3 \
--use_amber <true/false> \
--colabfold_model_preset "AlphaFold2-ptm" \
--use_gpu <true/false> \
--db_load_mode 0 \
-profile <docker/singularity/podman/shifter/charliecloud/conda/institute>


nextflow run nf-core/multiplesequencealign --input ./samplesheet.csv \
--tools ./toolsheet.csv \
--dependencies_dir ./proteinfold_results/*/*/top_ranked_structures \
--outdir ./results \
-profile <docker/singularity/podman/shifter/charliecloud/conda/institute>

```
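
Before launching the second command, it can help to confirm that the first run actually produced structures. The following is a minimal sketch, not part of either pipeline: the function name `count_top_ranked_structures` is hypothetical, and it assumes the predicted structures are written as `*.pdb` files under the `*/*/top_ranked_structures` layout used in the command above.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not shipped with proteinfold or multiplesequencealign).
# Assumption: predicted structures are *.pdb files located in
# <outdir>/*/*/top_ranked_structures, matching the glob passed to --dependencies_dir.

# Print how many .pdb files exist under <outdir>/*/*/top_ranked_structures.
count_top_ranked_structures() {
    local outdir="$1"
    local count=0
    local f
    for f in "$outdir"/*/*/top_ranked_structures/*.pdb; do
        # An unmatched glob is left as a literal string, so check the file exists.
        if [ -e "$f" ]; then
            count=$((count + 1))
        fi
    done
    echo "$count"
}
```

Calling `count_top_ranked_structures ./proteinfold_results` prints the number of structure files found; a result of `0` suggests checking the proteinfold run before starting the MSA pipeline.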

> [!NOTE]
> The one important parameter NOT to forget in proteinfold for the chaining is `--split_fasta`. It allows a multi-FASTA file to be used as input for monomer predictions, which the MSA pipeline needs. Also, the changes needed for the chaining are currently only present in the dev branch of proteinfold, so do not forget `-r dev` either. The rest of the proteinfold parameters can and should be tuned to your preferences for your proteinfold run. Please refer to the proteinfold documentation for this.

Member:

Maybe add a warning or a note saying that this is still an experimental feature in proteinfold that will be released in the near future?

Collaborator Author:

Yess

Collaborator Author:

added now

File renamed without changes.
12 changes: 6 additions & 6 deletions modules.json
@@ -7,7 +7,7 @@
"nf-core": {
"clustalo/align": {
"branch": "master",
"git_sha": "666652151335353eef2fcd58880bcef5bc2928e1",
"git_sha": "2a8530b890878747f5063a894bad9fb2abd5c071",
"installed_by": ["modules"]
},
"clustalo/guidetree": {
@@ -99,12 +99,12 @@
},
"tcoffee/alncompare": {
"branch": "master",
"git_sha": "666652151335353eef2fcd58880bcef5bc2928e1",
"git_sha": "ffa000ab3c33df25a165b5f9a039c4cbb665a77b",
"installed_by": ["modules"]
},
"tcoffee/consensus": {
"branch": "master",
"git_sha": "66b22564bc1bc0db7292f2073cdef954ead773e7",
"git_sha": "023e51187884ea6cc7290767486f551565f1b77a",
"installed_by": ["modules"]
},
"tcoffee/irmsd": {
@@ -143,17 +143,17 @@
"nf-core": {
"utils_nextflow_pipeline": {
"branch": "master",
"git_sha": "3aa0aec1d52d492fe241919f0c6100ebf0074082",
"git_sha": "c2b22d85f30a706a3073387f30380704fcae013b",
"installed_by": ["subworkflows"]
},
"utils_nfcore_pipeline": {
"branch": "master",
"git_sha": "1b6b9a3338d011367137808b49b923515080e3ba",
"git_sha": "1b89f75f1aa2021ec3360d0deccd0f6e97240551",
"installed_by": ["subworkflows"]
},
"utils_nfschema_plugin": {
"branch": "master",
"git_sha": "bbd5a41f4535a8defafe6080e00ea74c45f4f96c",
"git_sha": "2fd2cd6d0e7b273747f32e465fdc6bcc3ae0814e",
"installed_by": ["subworkflows"]
}
}
20 changes: 17 additions & 3 deletions modules/nf-core/clustalo/align/main.nf

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

18 changes: 17 additions & 1 deletion modules/nf-core/clustalo/align/meta.yml

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

122 changes: 103 additions & 19 deletions modules/nf-core/clustalo/align/tests/main.nf.test

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.
