Finalize documentation for 3.0 #110

Merged
merged 5 commits into from
Nov 30, 2023
17 changes: 13 additions & 4 deletions README.md
Expand Up @@ -10,10 +10,10 @@

Modular and user-friendly platform for AI-assisted rescoring of peptide identifications

> ⚠️ Note: This is the documentation for the fully redeveloped version 3.0 of MS²Rescore, which is
> now in the beta stage. While MS²Rescore 3.0 has been drastically improved over the previous
> version, you might run into some unforeseen issues. Please report any issues you encounter on the
> [issue tracker][issues] or post your questions on the [GitHub Discussions][discussions] forum.
> ⚠️ Note: This is the documentation for the fully redeveloped version 3.0 of MS²Rescore. While
> MS²Rescore 3.0 has been drastically improved over the previous version, you might run into some
> unforeseen issues. Please report any issues you encounter on the [issue tracker][issues] or post
> your questions on the [GitHub Discussions][discussions] forum.

## About MS²Rescore

Expand All @@ -25,13 +25,16 @@ identifications, which allows you to get **more peptide IDs** at the same false
number of peptide IDs. MS²Rescore is **ideal for challenging proteomics identification workflows**,
such as proteogenomics, metaproteomics, or immunopeptidomics.

![MS²Rescore overview](https://raw.githubusercontent.com/compomics/ms2rescore/main/docs/source/_static/img/ms2rescore-overview.png)

MS²Rescore can read peptide identifications in any format supported by [psm_utils][psm_utils]
(see [Supported file formats][file-formats]) and has been tested with output files from various
search engines:

- [MS Amanda](http://ms.imp.ac.at/?goto=msamanda) `.csv`
- [Sage](https://github.com/lazear/sage) `.sage.tsv`
- [PeptideShaker](https://compomics.github.io/projects/peptide-shaker.html) `.mzid`
- [ProteomeDiscoverer](#) `.msf`
- [MSGFPlus](https://omics.pnl.gov/software/ms-gf) `.mzid`
- [Mascot](https://www.matrixscience.com/) `.mzid`
- [MaxQuant](https://www.maxquant.org/) `msms.txt`
Expand All @@ -45,6 +48,12 @@ MS²Rescore is available as a [desktop application][desktop], a [command line to

**Latest MS²Rescore publication:**

> **MS²Rescore 3.0 is a modular, flexible, and user-friendly platform to boost peptide identifications, as showcased with MS Amanda 3.0.**
> Louise Marie Buur*, Arthur Declercq*, Marina Strobl, Robbin Bouwmeester, Sven Degroeve, Lennart Martens, Viktoria Dorfer*, and Ralf Gabriels*.
> _ChemRxiv_ (2023) [doi:10.26434/chemrxiv-2023-rvr9n](https://doi.org/10.26434/chemrxiv-2023-rvr9n) <br/> \*contributed equally <span class="__dimensions_badge_embed__" data-doi="10.26434/chemrxiv-2023-rvr9n" data-hide-zero-citations="true" data-style="small_rectangle"></span>

**MS²Rescore for immunopeptidomics:**

> **MS2Rescore: Data-driven rescoring dramatically boosts immunopeptide identification rates.**
> Arthur Declercq, Robbin Bouwmeester, Aurélie Hirschler, Christine Carapito, Sven Degroeve, Lennart Martens, and Ralf Gabriels.
> _Molecular & Cellular Proteomics_ (2021) [doi:10.1016/j.mcpro.2022.100266](https://doi.org/10.1016/j.mcpro.2022.100266) <span class="__dimensions_badge_embed__" data-doi="10.1016/j.mcpro.2022.100266" data-hide-zero-citations="true" data-style="small_rectangle"></span>
Binary file added docs/source/_static/img/gui-overview.png
Binary file added docs/source/_static/img/gui-screenshot-old.png
Binary file modified docs/source/_static/img/gui-screenshot.png
Binary file added docs/source/_static/img/ms2rescore-overview.png
Binary file added docs/source/_static/img/qc-reports.png
12 changes: 10 additions & 2 deletions docs/source/api/ms2rescore.feature_generators.rst
Expand Up @@ -35,10 +35,10 @@ ms2rescore.feature_generators.deeplc



ms2rescore.feature_generators.ms2pip
ms2rescore.feature_generators.ionmob
####################################

.. automodule:: ms2rescore.feature_generators.ms2pip
.. automodule:: ms2rescore.feature_generators.ionmob
   :members:


Expand All @@ -48,3 +48,11 @@ ms2rescore.feature_generators.maxquant

.. automodule:: ms2rescore.feature_generators.maxquant
   :members:



ms2rescore.feature_generators.ms2pip
####################################

.. automodule:: ms2rescore.feature_generators.ms2pip
   :members:
200 changes: 199 additions & 1 deletion docs/source/gui.rst
Expand Up @@ -2,4 +2,202 @@
Graphical user interface
************************

[TODO]

Installation
============

The MS²Rescore desktop application can be installed on Windows with a
:ref:`one-click installer <Windows installer>`. Alternatively, or on other platforms, follow the
:ref:`Python package installation instructions <Python package>`.


Starting the application
========================

If installed with the one-click installer, simply start MS²Rescore from the start menu or with the
desktop shortcut. Otherwise, start the application from the
:ref:`command line <command line interface>` with the command ``ms2rescore-gui`` or with
``python -m ms2rescore.gui``.


Application overview
====================

The MS²Rescore graphical user interface is divided into three main sections:

1. A side bar with references, window controls, and the current version number.
2. The configuration pane with input file selection and parameter configuration.
3. The application log pane with the status output.

At the bottom of the window, the application log level can be selected. The log level determines
which messages are shown in the application log pane. At the bottom right, the rescoring process
can be started with the "Start" button. The "Stop" button can be used to stop the process at any
time during execution.

.. figure:: ../_static/img/gui-overview.png
   :width: 100%
   :alt: MS²Rescore graphical user interface

   Overview of the MS²Rescore desktop application.


Configuring MS²Rescore
======================

Input file selection
^^^^^^^^^^^^^^^^^^^^

The main inputs for MS²Rescore are the PSM file(s) (search engine output) and the spectrum file(s).
See :ref:`Input files` for more information.

One or more PSM files can be selected from the file system with the "Browse files" button.
To ensure that the file is read correctly, specify the file type from the drop-down menu.

.. figure:: ../_static/img/gui-example-xtandem-psm-file.png
   :width: 60%
   :alt: PSM file selection

   PSM file selection


.. figure:: ../_static/img/gui-example-xtandem-psm-filetype.png
   :width: 60%
   :alt: PSM file type selection

   PSM file type selection


To select a single spectrum file (mzML or MGF), click the "Browse files" button. To select a
folder with spectrum files, click the "Browse directories" button.

.. figure:: ../_static/img/gui-example-xtandem-spectra.png
   :width: 60%
   :alt: Spectrum file selection

   Spectrum file selection


Optionally, for protein inference information, a FASTA file can also be provided. Ensure that
this file contains the same protein sequences as the search database used for the search engine.
If a FASTA file is provided, protein digestion settings may need to be configured in the rescoring
engine configuration.


Number of processes
^^^^^^^^^^^^^^^^^^^

The number of processes can be configured to run the application in parallel. The default is to
use all available CPU cores. The number of processes can be reduced to avoid overloading the
system or to avoid memory issues. A number under 16 is recommended.


Modification mapping
^^^^^^^^^^^^^^^^^^^^

Depending on the search engine, the peptide modification labels will have to be mapped
to labels that can be understood by MS²Rescore. For example, X!Tandem uses mass shift labels, such
as ``+57.02146`` for carbamidomethylation. However, tools such as DeepLC require the atomic
composition for all modifications. As this cannot be derived from the mass shift (or other labels
that are not known to MS²Rescore), a mapping has to be provided.

.. figure:: ../_static/img/gui-example-xtandem-modifications-before.png
   :width: 70%
   :alt: Modification mapping

   Modification mapping configuration. Click the plus sign to add more rows.


In modification mapping, click the plus sign to add more rows to the table, or click the minus sign
to remove rows. In the first column "Search engine label", enter the modification label as it
appears in the PSM file. In the second column "ProForma label", enter a ProForma-compatible
modification label. More information on accepted labels can be found in :ref:`Parsing modification
labels`.

.. figure:: ../_static/img/gui-example-xtandem-modifications-filled.png
   :width: 70%
   :alt: Modification mapping

   Modification mapping configuration for the X!Tandem example. Mass shift labels from X!Tandem
   are mapped to ProForma UniMod labels.
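
To make the idea of a modification mapping concrete, the following Python sketch rewrites
bracketed mass shift labels in a peptide string into ProForma labels. It is an illustration only
and does not reproduce how MS²Rescore or psm_utils apply the mapping; the function name
``map_modifications``, the bracketed input notation, and the two example entries are assumptions
chosen for this sketch.

```python
import re

# Hypothetical mapping for illustration: search engine label -> ProForma label.
MODIFICATION_MAPPING = {
    "+57.02146": "U:Carbamidomethyl",
    "+15.99492": "U:Oxidation",
}

# Matches any bracketed label, for example "[+57.02146]".
_LABEL_PATTERN = re.compile(r"\[([^\]]+)\]")


def map_modifications(peptide, mapping):
    """Replace each bracketed search engine label with its ProForma label."""

    def _replace(match):
        label = match.group(1)
        if label not in mapping:
            # An unmapped label cannot be interpreted, so fail loudly.
            raise KeyError(f"No mapping provided for modification label {label!r}")
        return f"[{mapping[label]}]"

    return _LABEL_PATTERN.sub(_replace, peptide)
```

Raising an error for unmapped labels reflects the point made above: a label that is not known
cannot be converted to an atomic composition, so it has to be added to the mapping explicitly.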


Fixed modifications
^^^^^^^^^^^^^^^^^^^

If the search engine PSM file does not contain information on which fixed modifications were used,
this must be specified in the MS²Rescore configuration. At the time of writing, only MaxQuant
``msms.txt`` files do not contain this information. For all other search engines, this information
is contained in the PSM file and the following field can be left empty.


Advanced options
^^^^^^^^^^^^^^^^

Most advanced options are only required for specific use cases or with specific search engine PSM
files. All options are listed in the :doc:`userguide/configuration` section of the user guide.

In the X!Tandem example, only the `PSM ID regex pattern` option is required. This option is used
to extract the spectrum ID from the PSM file. The spectrum ID is used to match the PSM to the
spectrum file. See :ref:`Mapping PSMs to spectra` for more information.
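
As a rough illustration of what such a pattern does, the Python sketch below extracts a spectrum
ID from a PSM identifier with a regular expression. The pattern, the example identifier, and the
helper name ``extract_spectrum_id`` are hypothetical; the sketch assumes that the pattern contains
a single capturing group that matches the spectrum ID. The pattern required for a given PSM file
depends on how that file was written.

```python
import re

# Hypothetical pattern: capture everything up to the first whitespace character.
PSM_ID_PATTERN = re.compile(r"^(\S+)\s.*")


def extract_spectrum_id(psm_id, pattern=PSM_ID_PATTERN):
    """Return the first capturing group of `pattern`, or raise if it does not match."""
    match = pattern.match(psm_id)
    if match is None:
        raise ValueError(f"Pattern does not match PSM ID {psm_id!r}")
    return match.group(1)
```

With this hypothetical pattern, an identifier such as ``spectrum_42 RTINSECONDS=123.4`` would be
reduced to ``spectrum_42``, which can then be looked up in the spectrum file.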

.. figure:: ../_static/img/gui-example-xtandem-advanced.png
   :width: 70%
   :alt: Advanced options

   Advanced options


For reference, all parameters for the X!Tandem example are also listed in the example
configuration file on
`GitHub <https://github.com/compomics/ms2rescore/blob/main/examples/xtandem-ms2rescore.toml>`_.


Starting the rescoring process
==============================

After the configuration is complete, click the "Start" button to start the rescoring process.
The application will show the progress in the application log pane. The log level can be changed
before the run to show more or less information.

.. figure:: ../_static/img/gui-example-xtandem-progress.png
   :width: 100%
   :alt: Running application

   Running application with log output


A pop-up will appear when the application has finished or when an error has occurred. If an error
has occurred, the error message in the pop-up should provide some insight into what went wrong.
If the error message is not clear, please report the issue on the
`GitHub issue tracker <https://github.com/compomics/ms2rescore/issues>`_ or post your question on
the `Discussion forum <https://github.com/compomics/ms2rescore/discussions>`_.

.. figure:: ../_static/img/gui-example-xtandem-finished.png
   :width: 40%
   :alt: Pop up when MS²Rescore is finished

   Pop up when MS²Rescore is finished


Viewing the results
===================

After a successful run, the output files can be found in the directory of the input PSM file, or
in the specified output directory. The most important files are the ``*.ms2rescore.psms.tsv`` file,
which contains all PSMs with their new scores, and the ``*.ms2rescore.report.html`` file, which
contains interactive charts that visualize the results and various quality control metrics. See
:ref:`Output files` for more information.
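
For scripted workflows, the two output files mentioned above can be located by their suffixes.
The following Python sketch assumes only the file name patterns given in the text; the helper name
``find_output_files`` is hypothetical and is not part of MS²Rescore.

```python
from pathlib import Path


def find_output_files(output_dir):
    """Return the rescored PSM tables and HTML reports found in `output_dir`."""
    output_dir = Path(output_dir)
    return {
        # Tab-separated tables with all PSMs and their new scores.
        "psms": sorted(output_dir.glob("*.ms2rescore.psms.tsv")),
        # Interactive HTML reports with quality control charts.
        "reports": sorted(output_dir.glob("*.ms2rescore.report.html")),
    }
```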

.. figure:: ../_static/img/gui-example-xtandem-output-files.png
   :width: 100%
   :alt: Output files

   Overview of the output files after rescoring the X!Tandem example.

Double-click the ``*.ms2rescore.report.html`` file to open it in the default web browser:

.. figure:: ../_static/img/qc-reports.png
   :width: 100%
   :alt: Rescoring report

   Rescoring QC report with interactive charts.
19 changes: 18 additions & 1 deletion docs/source/installation.rst
Expand Up @@ -48,7 +48,7 @@ Docker container
   :target: https://quay.io/repository/biocontainers/ms2rescore

First check the latest version tag on
`biocontainers/ms2rescore/tags <https://quay.io/repository/biocontainers/ms2rescore?tab=tags>`__.
`biocontainers/ms2rescore/tags <https://quay.io/repository/biocontainers/ms2rescore?tab=tags>`_.
Then pull and run the container with:

.. code-block:: bash
Expand All @@ -60,6 +60,23 @@ files, ``<tag>`` is the container version tag, and ``<ms2rescore-arguments>`` ar
command line options (see :ref:`Command line interface`).


Installing Percolator
=====================

To use :ref:`percolator` as the rescoring engine, it must be installed separately. Percolator is
available for most platforms and can be downloaded from the
`GitHub releases page <https://github.com/percolator/percolator/releases/latest>`_. Ensure that
the ``percolator`` executable is in your ``PATH``. On Windows, this can be done by checking the
``Add percolator to the system PATH for current user`` option during installation:

.. figure:: ../_static/img/percolator-install-path.png
   :width: 60%
   :alt: Percolator installation on Windows
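
Whether the ``percolator`` executable can be found on the ``PATH`` can be checked from Python with
the standard library function ``shutil.which``. The sketch below is a convenience check only; the
helper name ``find_executable`` is hypothetical and is not part of MS²Rescore.

```python
import shutil


def find_executable(name="percolator"):
    """Return the full path of `name` if it is on the PATH, else None."""
    return shutil.which(name)


if __name__ == "__main__":
    path = find_executable()
    if path is None:
        print("percolator was not found on the PATH")
    else:
        print(f"percolator found at {path}")
```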

.. note::
   Alternatively, :ref:`mokapot` can be used as the rescoring engine, which does not require a
   separate installation.

For development
===============

4 changes: 4 additions & 0 deletions docs/source/userguide/configuration.rst
Expand Up @@ -140,6 +140,10 @@ be configured separately. For instance:
   :alt: fixed modifications configuration in GUI


.. caution::
   Most search engines DO return fixed modifications as part of the modified peptide sequences.
   In these cases, they must NOT be added to the ``fixed_modifications`` configuration.


Mapping PSMs to spectra
=======================
32 changes: 19 additions & 13 deletions docs/source/userguide/input-files.rst
Expand Up @@ -2,24 +2,30 @@
Input files
###########

PSM file
========
PSM file(s)
===========

[todo]
The peptide-spectrum match (PSM) file is generally the output from a proteomics search engine.
This file serves as the main input to MS²Rescore. One or multiple PSM files can be provided at
once. Note that merging PSMs from different MS runs could have an impact on the correctness of
the FDR control.

As a general rule, MS²Rescore always needs access to all target and decoy PSMs, not
only the FDR-filtered targets.
Various PSM file types are supported. The type can be specified with the ``psm_file_type`` option.
Check the list of :py:mod:`psm_utils` tags in the
:external+psm_utils:ref:`supported file formats <supported file formats>` section. Depending on the
file extension, the file type can also be inferred from the file name. In that case, the
``psm_file_type`` option can be set to ``infer``.

The ``psm_file_type`` can be one of the :py:mod:`psm_utils` tags as listed in the
:external+psm_utils:ref:`supported file formats <supported file formats>`. Depending on the file
extension, the file type can also be inferred from the file name. In that case, ``psm_file_type``
option can be set to ``infer``.
.. attention::
   As a general rule, MS²Rescore always needs access to **all target and decoy PSMs, without any
   FDR-filtering**. For some search engines, this means that the FDR-filter should be disabled or
   set to 100%.
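
The need for unfiltered target and decoy PSMs follows from how target-decoy FDR estimation works:
the number of decoys above a score threshold is used to estimate the number of false targets above
it. The Python sketch below shows a deliberately simple version of that estimate. It is not the
procedure used by Percolator or mokapot, and the function name ``estimate_fdr`` and the
``(score, is_decoy)`` input format are assumptions made for this illustration.

```python
def estimate_fdr(psms):
    """Estimate the FDR at each score threshold from target and decoy PSMs.

    `psms` is an iterable of (score, is_decoy) tuples, where a higher score is better.
    Returns a list of (score, is_decoy, fdr) tuples sorted by descending score.
    """
    ranked = sorted(psms, key=lambda psm: psm[0], reverse=True)
    results = []
    n_targets = 0
    n_decoys = 0
    for score, is_decoy in ranked:
        if is_decoy:
            n_decoys += 1
        else:
            n_targets += 1
        # Decoys above the threshold approximate the false targets above it.
        fdr = n_decoys / max(n_targets, 1)
        results.append((score, is_decoy, fdr))
    return results
```

If the decoys or the low-scoring targets have already been removed from the PSM file, the decoy
count in this estimate is no longer meaningful, and neither is any FDR derived from it.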


Spectrum file(s)
================

[todo]

If the ``spectrum_path`` is a directory, MS²Rescore will search for spectrum files in the
directory according to the run names in the PSM file.
Spectrum files are required for some feature generators. Both ``mzML`` and ``mgf`` formats are
supported. The ``spectrum_path`` option can be either a single file or a folder. If the
``spectrum_path`` is a folder, MS²Rescore will search for spectrum files in the directory according
to the run names in the PSM file.
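
The folder lookup described above can be pictured with the following Python sketch. It is an
assumption-laden illustration rather than the actual MS²Rescore implementation: the helper name
``find_spectrum_file`` is hypothetical, and the sketch assumes that a spectrum file matches a run
when its file name without the extension equals the run name.

```python
from pathlib import Path

SPECTRUM_SUFFIXES = {".mzml", ".mgf"}  # compared case-insensitively


def find_spectrum_file(spectrum_path, run_name):
    """Resolve the spectrum file for `run_name`.

    If `spectrum_path` is a file, it is returned as-is. If it is a folder,
    the folder is searched for `<run_name>.mzML` or `<run_name>.mgf`.
    """
    spectrum_path = Path(spectrum_path)
    if spectrum_path.is_file():
        return spectrum_path
    for candidate in sorted(spectrum_path.iterdir()):
        if candidate.stem == run_name and candidate.suffix.lower() in SPECTRUM_SUFFIXES:
            return candidate
    raise FileNotFoundError(f"No spectrum file found for run {run_name!r} in {spectrum_path}")
```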