Finalize documentation for release
RalfG committed Nov 21, 2023
1 parent 3038ba5 commit 53d7c5b
Showing 22 changed files with 320 additions and 32 deletions.
8 changes: 8 additions & 0 deletions README.md
@@ -32,6 +32,7 @@ files:
- [MS Amanda](http://ms.imp.ac.at/?goto=msamanda) `.csv`
- [Sage](https://github.com/lazear/sage) `.sage.tsv`
- [PeptideShaker](https://compomics.github.io/projects/peptide-shaker.html) `.mzid`
- [ProteomeDiscoverer](#) `.msf`
- [MSGFPlus](https://omics.pnl.gov/software/ms-gf) `.mzid`
- [Mascot](https://www.matrixscience.com/) `.mzid`
- [MaxQuant](https://www.maxquant.org/) `msms.txt`
@@ -45,6 +46,13 @@ MS²Rescore is available as a [desktop application][desktop], a [command line to

**Latest MS²Rescore publication:**

> **MS²Rescore 3.0 is a modular, flexible, and user-friendly platform to boost peptide identifications, as showcased with MS Amanda 3.0.**
> Louise Marie Buur*, Arthur Declercq*, Marina Strobl, Robbin Bouwmeester, Sven Degroeve, Lennart Martens, Viktoria Dorfer*, and Ralf Gabriels*.
> _ChemRxiv_ (2023) [doi:10.26434/chemrxiv-2023-rvr9n](https://doi.org/10.26434/chemrxiv-2023-rvr9n) <br/>
> *contributed equally <span class="__dimensions_badge_embed__" data-doi="10.26434/chemrxiv-2023-rvr9n" data-hide-zero-citations="true" data-style="small_rectangle"></span>
**MS²Rescore for immunopeptidomics:**

> **MS2Rescore: Data-driven rescoring dramatically boosts immunopeptide identification rates.**
> Arthur Declercq, Robbin Bouwmeester, Aurélie Hirschler, Christine Carapito, Sven Degroeve, Lennart Martens, and Ralf Gabriels.
> _Molecular & Cellular Proteomics_ (2022) [doi:10.1016/j.mcpro.2022.100266](https://doi.org/10.1016/j.mcpro.2022.100266) <span class="__dimensions_badge_embed__" data-doi="10.1016/j.mcpro.2022.100266" data-hide-zero-citations="true" data-style="small_rectangle"></span>
Binary file added docs/source/_static/img/gui-overview.png
Binary file added docs/source/_static/img/gui-screenshot-old.png
Binary file modified docs/source/_static/img/gui-screenshot.png
12 changes: 10 additions & 2 deletions docs/source/api/ms2rescore.feature_generators.rst
@@ -35,10 +35,10 @@ ms2rescore.feature_generators.deeplc



ms2rescore.feature_generators.ms2pip
ms2rescore.feature_generators.ionmob
####################################

.. automodule:: ms2rescore.feature_generators.ms2pip
.. automodule:: ms2rescore.feature_generators.ionmob
:members:


@@ -48,3 +48,11 @@ ms2rescore.feature_generators.maxquant

.. automodule:: ms2rescore.feature_generators.maxquant
:members:



ms2rescore.feature_generators.ms2pip
####################################

.. automodule:: ms2rescore.feature_generators.ms2pip
:members:
192 changes: 191 additions & 1 deletion docs/source/gui.rst
@@ -2,4 +2,194 @@
Graphical user interface
************************

[TODO]

Installation
============

The MS²Rescore desktop application can be installed on Windows with a
:ref:`one-click installer <Windows installer>`. Alternatively, or on other platforms, follow the
:ref:`Python package installation instructions <Python package>`.


Starting the application
========================

If installed with the one-click installer, simply start MS²Rescore from the start menu or with the
desktop shortcut. Otherwise, start the application from the
:ref:`command line <command line interface>` with the command ``ms2rescore-gui`` or with
``python -m ms2rescore.gui``.


Application overview
====================

The MS²Rescore graphical user interface is divided into three main sections:

1. A side bar with references, window controls, and the current version number.
2. The configuration pane with input file selection and parameter configuration.
3. The application log pane with the status output.

At the bottom of the window, the application log level can be selected. The log level determines
which messages are shown in the application log pane. At the bottom right, the rescoring run can be
started with the "Start" button. The "Stop" button can be used to stop the run at any time
during execution.

.. figure:: ../_static/img/gui-overview.png
:width: 100%
:alt: MS²Rescore graphical user interface

Overview of the MS²Rescore desktop application.


Configuring MS²Rescore
======================

Input file selection
^^^^^^^^^^^^^^^^^^^^

The main inputs for MS²Rescore are the PSM file(s) (search engine output) and the spectrum file(s).
See :ref:`Input files` for more information.

One or more PSM files can be selected from the file system with the "Browse files" button.
To ensure that the file is read correctly, specify the file type from the drop-down menu.

.. figure:: ../_static/img/gui-example-xtandem-psm-file.png
:width: 60%
:alt: PSM file selection

PSM file selection


.. figure:: ../_static/img/gui-example-xtandem-psm-filetype.png
:width: 60%
:alt: PSM file type selection

PSM file type selection


To select a single spectrum file (mzML or MGF), click the "Browse files" button. To select a
folder with spectrum files, click the "Browse directories" button.

.. figure:: ../_static/img/gui-example-xtandem-spectra.png
:width: 60%
:alt: Spectrum file selection

Spectrum file selection


Optionally, for protein inference information, a FASTA file can also be provided. Ensure that
this file contains the same protein sequences as the search database used for the search engine.
If a FASTA file is provided, protein digestion settings may need to be configured in the rescoring
engine configuration.


Number of processes
^^^^^^^^^^^^^^^^^^^

The number of processes can be configured to run the application in parallel. The default is to
use all available CPU cores. The number of processes can be reduced to avoid overloading the
system or to avoid memory issues. A number under 16 is recommended.
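
As a rough illustration of this advice, the sketch below picks a process count that defaults to
all available CPU cores and is capped at 15 to stay under 16. The function
``choose_num_processes`` and its parameters are hypothetical and not part of MS²Rescore.

```python
import os


def choose_num_processes(requested=None, available=None, upper_limit=15):
    """Return a process count between 1 and ``upper_limit``.

    Hypothetical helper for illustration only; not an MS²Rescore function.
    """
    # Fall back to the number of CPU cores reported by the operating system.
    if available is None:
        available = os.cpu_count() or 1
    # Without an explicit (positive) request, default to all available cores.
    if requested is None or requested < 1:
        requested = available
    # Never exceed the available cores or the recommended upper limit.
    return max(1, min(requested, available, upper_limit))


if __name__ == "__main__":
    print(choose_num_processes())
```

On a machine with many cores, the cap matters most: with 64 cores and no explicit request, the
helper returns 15 instead of 64.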


Modification mapping
^^^^^^^^^^^^^^^^^^^^

Depending on the search engine, the peptide modification labels will have to be mapped
to labels that can be understood by MS²Rescore. For example, X!Tandem uses mass shift labels, such
as ``+57.02146`` for carbamidomethylation. However, tools such as DeepLC require the atomic
composition for all modifications. As this cannot be derived from the mass shift (or other labels
that are not known to MS²Rescore), a mapping has to be provided.

.. figure:: ../_static/img/gui-example-xtandem-modifications-before.png
:width: 70%
:alt: Modification mapping

Modification mapping configuration. Click the plus sign to add more rows.


In modification mapping, click the plus sign to add more rows to the table, or click the minus sign
to remove rows. In the first column "Search engine label", enter the modification label as it
appears in the PSM file. In the second column "ProForma label", enter a ProForma-compatible
modification label. More information on accepted labels can be found in :ref:`Parsing modification
labels`.

.. figure:: ../_static/img/gui-example-xtandem-modifications-filled.png
:width: 70%
:alt: Modification mapping

Modification mapping configuration for the X!Tandem example. Mass shift labels from X!Tandem
are mapped to ProForma UniMod labels.
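
Conceptually, the mapping table behaves like a lookup from search engine labels to ProForma
labels. The sketch below is illustrative only and does not show how MS²Rescore applies the mapping
internally. The helper ``map_modification_labels``, the bracketed peptide notation, and the
oxidation entry are assumptions made for this example.

```python
import re

# Example mapping: search engine label -> ProForma label.
# The carbamidomethylation mass shift comes from the text above;
# the oxidation entry is an assumed extra row for illustration.
modification_mapping = {
    "+57.02146": "U:Carbamidomethyl",
    "+15.99491": "U:Oxidation",
}


def map_modification_labels(peptidoform, mapping):
    """Replace each bracketed label that has an entry in ``mapping``.

    Hypothetical helper; labels without a mapping are left unchanged.
    """

    def replace(match):
        label = match.group(1)
        return "[" + mapping.get(label, label) + "]"

    # Match any text between square brackets, e.g. "[+57.02146]".
    return re.sub(r"\[([^\]]+)\]", replace, peptidoform)


if __name__ == "__main__":
    print(map_modification_labels("AC[+57.02146]DM[+15.99491]K", modification_mapping))
```

Labels without an entry stay as they are, which makes missing rows in the mapping easy to spot.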


Fixed modifications
^^^^^^^^^^^^^^^^^^^

If the search engine PSM file does not contain information on which fixed modifications were used,
this must be specified in the MS²Rescore configuration. At the time of writing, only MaxQuant
``msms.txt`` files do not contain this information. For all other search engines, this information
is contained in the PSM file and the following field can be left empty.


Advanced options
^^^^^^^^^^^^^^^^

Most advanced options are only required for specific use cases or with specific search engine PSM
files. All options are listed in the :doc:`userguide/configuration` section of the user guide.

In the X!Tandem example, only the `PSM ID regex pattern` option is required. This option is used
to extract the spectrum ID from the PSM file. The spectrum ID is used to match the PSM to the
spectrum file. See :ref:`Mapping PSMs to spectra` for more information.
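
To illustrate the idea, the sketch below applies a regular expression with a single capturing
group to a PSM identifier and returns the captured spectrum ID. The identifier format, the
pattern, and the helper ``extract_spectrum_id`` are made up for this example; the use of exactly
one capturing group is also an assumption.

```python
import re


def extract_spectrum_id(psm_id, pattern):
    """Return the first capturing group of ``pattern`` found in ``psm_id``.

    Hypothetical helper for illustration; not an MS²Rescore function.
    """
    match = re.search(pattern, psm_id)
    if match is None:
        raise ValueError(f"Pattern {pattern!r} does not match PSM ID {psm_id!r}")
    # group(1) is the part of the identifier inside the parentheses of the pattern.
    return match.group(1)


if __name__ == "__main__":
    # Made-up PSM identifier that ends with a scan number.
    example_id = "controllerType=0 controllerNumber=1 scan=15"
    print(extract_spectrum_id(example_id, r"scan=(\d+)$"))
```

Testing a pattern against a few identifiers copied from the PSM file before starting a run can
help catch mismatches early.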

.. figure:: ../_static/img/gui-example-xtandem-advanced.png
:width: 70%
:alt: Advanced options

Advanced options


For reference, all parameters for the X!Tandem example are also listed in the example
configuration file on
`GitHub <https://github.com/compomics/ms2rescore/blob/main/examples/xtandem-ms2rescore.toml>`_.


Starting the rescoring process
==============================

After the configuration is complete, click the "Start" button to start the rescoring process.
The application will show the progress in the application log pane. The log level can be changed
before the run to show more or less information.

.. figure:: ../_static/img/gui-example-xtandem-progress.png
:width: 100%
:alt: Running application

Running application with log output


A pop-up will appear when the application has finished or when an error has occurred. If an error
has occurred, the error message in the pop-up should provide some insight into what went wrong.
If the error message is not clear, please report the issue on the
`GitHub issue tracker <https://github.com/compomics/ms2rescore/issues>`_ or post your question on
the `Discussion forum <https://github.com/compomics/ms2rescore/discussions>`_.

.. figure:: ../_static/img/gui-example-xtandem-finished.png
:width: 40%
:alt: Pop up when MS²Rescore is finished

Pop up when MS²Rescore is finished


Viewing the results
===================

After a successful run, the output files can be found in the directory of the input PSM file, or
in the specified output directory. The most important files are the ``*.ms2rescore.psms.tsv`` file,
which contains all PSMs with their new scores, and the ``*.ms2rescore.report.html`` file, which
contains interactive charts that visualize the results and various quality control metrics. See
:ref:`Output files` for more information.
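
For scripted post-processing, the two main output files can be located by their file name
suffixes. The sketch below is a minimal example that only relies on the suffixes named above. The
helper ``find_main_outputs`` is hypothetical.

```python
from pathlib import Path


def find_main_outputs(output_dir):
    """Collect the main MS²Rescore output files found in ``output_dir``.

    Hypothetical helper; it only matches the file name suffixes mentioned in the text.
    """
    output_dir = Path(output_dir)
    return {
        # Rescored PSMs with their new scores.
        "psms": sorted(output_dir.glob("*.ms2rescore.psms.tsv")),
        # Interactive HTML report with quality control charts.
        "report": sorted(output_dir.glob("*.ms2rescore.report.html")),
    }


if __name__ == "__main__":
    for kind, paths in find_main_outputs(".").items():
        print(kind, [path.name for path in paths])
```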

.. figure:: ../_static/img/gui-example-xtandem-output-files.png
:width: 100%
:alt: Output files

Overview of the output files after rescoring the X!Tandem example.
4 changes: 4 additions & 0 deletions docs/source/userguide/configuration.rst
@@ -140,6 +140,10 @@ be configured separately. For instance:
:alt: fixed modifications configuration in GUI


.. caution::
Most search engines DO return fixed modifications as part of the modified peptide sequences.
In these cases, they must NOT be added to the ``fixed_modifications`` configuration.


Mapping PSMs to spectra
=======================
32 changes: 19 additions & 13 deletions docs/source/userguide/input-files.rst
@@ -2,24 +2,30 @@
Input files
###########

PSM file
========
PSM file(s)
===========

[todo]
The peptide-spectrum match (PSM) file is generally the output from a proteomics search engine.
This file serves as the main input to MS²Rescore. One or multiple PSM files can be provided at
once. Note that merging PSMs from different MS runs could have an impact on the correctness of
the FDR control.

As a general rule, MS²Rescore always needs access to all target and decoy PSMs, not
only the FDR-filtered targets.
Various PSM file types are supported. The type can be specified with the ``psm_file_type`` option.
Check the list of :py:mod:`psm_utils` tags in the
:external+psm_utils:ref:`supported file formats <supported file formats>` section. Depending on the
file extension, the file type can also be inferred from the file name. In that case, the
``psm_file_type`` option can be set to ``infer``.

The ``psm_file_type`` can be one of the :py:mod:`psm_utils` tags as listed in the
:external+psm_utils:ref:`supported file formats <supported file formats>`. Depending on the file
extension, the file type can also be inferred from the file name. In that case, ``psm_file_type``
option can be set to ``infer``.
.. attention::
As a general rule, MS²Rescore always needs access to **all target and decoy PSMs, without any
FDR-filtering**. For some search engines, this means that the FDR-filter should be disabled or
set to 100%.


Spectrum file(s)
================

[todo]

If the ``spectrum_path`` is a directory, MS²Rescore will search for spectrum files in the
directory according to the run names in the PSM file.
Spectrum files are required for some feature generators. Both ``mzML`` and ``mgf`` formats are
supported. The ``spectrum_path`` option can be either a single file or a folder. If the
``spectrum_path`` is a folder, MS²Rescore will search for spectrum files in the directory according
to the run names in the PSM file.
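
The directory lookup can be pictured as follows. This is a simplified sketch, assuming that a
spectrum file is named after its run and carries an ``.mzML`` or ``.mgf`` extension. The exact
matching rules used by MS²Rescore are not described here, and ``find_spectrum_file`` is a
hypothetical helper.

```python
from pathlib import Path

# Supported spectrum file extensions, compared case-insensitively.
SPECTRUM_SUFFIXES = (".mzml", ".mgf")


def find_spectrum_file(spectrum_path, run_name):
    """Return the spectrum file for ``run_name``.

    Hypothetical helper: if ``spectrum_path`` is a single file, it is returned as-is;
    if it is a folder, the first file named ``<run_name>.mzML`` or ``<run_name>.mgf`` is returned.
    """
    path = Path(spectrum_path)
    if path.is_file():
        return path
    for candidate in sorted(path.iterdir()):
        if (
            candidate.is_file()
            and candidate.stem == run_name
            and candidate.suffix.lower() in SPECTRUM_SUFFIXES
        ):
            return candidate
    raise FileNotFoundError(f"No spectrum file found for run {run_name!r} in {path}")
```

Checking the run names in the PSM file against the spectrum file names before a run is a quick way
to rule out missing or misnamed files.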
59 changes: 58 additions & 1 deletion docs/source/userguide/output-files.rst
Original file line number Diff line number Diff line change
Expand Up @@ -2,4 +2,61 @@
Output files
############

[todo]
Depending on the options you choose, the following files will be created. Note that none of the
PSMs, peptides, or proteins in these files have been filtered at any false discovery rate (FDR) level.

Main output files:

+-----------------------------------+----------------------------------------------------------------------------------+
| File | Description |
+===================================+==================================================================================+
| ``<prefix>.psms.tsv`` | Main output file with rescored PSMs and their new scores |
+-----------------------------------+----------------------------------------------------------------------------------+
| ``<prefix>.report.html`` | HTML report with interactive plots showing the results and some quality control |
| | metrics. |
+-----------------------------------+----------------------------------------------------------------------------------+

Log and configuration files:

+--------------------------------------+--------------------------------------------------------------------------------------+
| File | Description |
+======================================+======================================================================================+
| ``<prefix>.log.txt`` | Log file with information about the run |
+--------------------------------------+--------------------------------------------------------------------------------------+
| ``<prefix>.log.html`` | HTML version of the log file |
+--------------------------------------+--------------------------------------------------------------------------------------+
| ``<prefix>.full-config.json`` | Full configuration file with all the parameters used |
| | as configured in the user-provided configuration file, the command line or graphical |
| | interface, and the default values. |
+--------------------------------------+--------------------------------------------------------------------------------------+
| ``<prefix>.feature_names.tsv`` | List of the features and their descriptions |
+--------------------------------------+--------------------------------------------------------------------------------------+

Rescoring engine files:

+-------------------------------------------------------------+-------------------------------------------------------------+
| File | Description |
+=============================================================+=============================================================+
| ``<prefix>.<mokapot/percolator>.psms.txt`` | PSMs and their new scores at PSM-level FDR. |
+-------------------------------------------------------------+-------------------------------------------------------------+
| ``<prefix>.<mokapot/percolator>.peptides.txt`` | Peptides and their new scores at peptide-level FDR. |
+-------------------------------------------------------------+-------------------------------------------------------------+
| ``<prefix>.<mokapot/percolator>.proteins.txt`` | Proteins and their new scores at protein-level FDR. |
+-------------------------------------------------------------+-------------------------------------------------------------+
| ``<prefix>.<mokapot/percolator>.decoy.psms.txt`` | Decoy PSMs and their new scores at PSM-level FDR. |
+-------------------------------------------------------------+-------------------------------------------------------------+
| ``<prefix>.<mokapot/percolator>.decoy.peptides.txt`` | Decoy peptides and their new scores at peptide-level FDR. |
+-------------------------------------------------------------+-------------------------------------------------------------+
| ``<prefix>.<mokapot/percolator>.decoy.proteins.txt`` | Decoy proteins and their new scores at protein-level FDR. |
+-------------------------------------------------------------+-------------------------------------------------------------+
| ``<prefix>.<mokapot/percolator>.weights.txt`` | Feature weights, showing feature usage in the rescoring run |
+-------------------------------------------------------------+-------------------------------------------------------------+

If no rescoring engine is selected (or if Percolator is selected), the following files will also
be written:

+-------------------------------------------------------------+-----------------------------------------------------------+
| File | Description |
+=============================================================+===========================================================+
| ``<prefix>.pin`` | PSMs with all features for rescoring |
+-------------------------------------------------------------+-----------------------------------------------------------+
18 changes: 16 additions & 2 deletions ms2rescore/feature_generators/ionmob.py
@@ -1,6 +1,20 @@
import contextlib
"""
``ionmob`` collisional cross section (CCS)-based feature generator.

``ionmob`` is a predictor for peptide collisional cross sections (CCS), as measured in ion mobility
devices, such as the Bruker timsTOF instruments. More info can be found on the
`ionmob GitHub page <https://github.com/theGreatHerrLebert/ionmob>`_.

If you use ``ionmob`` in your work, please cite the following publication:

.. epigraph::
    Teschner, D. et al. Ionmob: a Python package for prediction of peptide collisional
    cross-section values. *Bioinformatics* 39, btad486 (2023).
    `doi:10.1093/bioinformatics/btad486 <https://doi.org/10.1093/bioinformatics/btad486>`_
"""

import logging
import os
from itertools import chain
from pathlib import Path
from typing import Dict, Optional