Merge pull request #161 from compomics/timsRescore
TIMS²Rescore-related fixes and improvements
RalfG authored Jul 20, 2024
2 parents 77e7c1b + 696165a commit 7ccb280
Showing 10 changed files with 128 additions and 43 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/test.yml
@@ -55,7 +55,7 @@ jobs:
- name: Install package and dependencies
run: |
python -m pip install --upgrade pip
pip install . pyinstaller
pip install --only-binary :all: . pyinstaller
- name: Install Inno Setup
uses: crazy-max/ghaction-chocolatey@v1
19 changes: 18 additions & 1 deletion README.md
@@ -44,6 +44,17 @@ files:
MS²Rescore is available as a [desktop application][desktop], a [command line tool][cli], and a
[modular Python API][python-package].

## TIMS²Rescore: Direct support for DDA-PASEF data

MS²Rescore v3.1+ includes TIMS²Rescore, a usage mode with specialized default configurations for
DDA-PASEF data from timsTOF instruments. TIMS²Rescore makes use of new MS²PIP prediction models for
timsTOF fragmentation and IM2Deep for ion mobility separation. Bruker .d and miniTDF spectrum
files are directly supported through the [timsrust](https://github.com/MannLabs/timsrust) library.

Check out our [preprint](https://doi.org/10.1101/2024.05.29.596400) for more information and the
[TIMS²Rescore documentation](https://ms2rescore.readthedocs.io/en/stable/userguide/tims2rescore)
to get started.

## Citing

**Latest MS²Rescore publication:**
@@ -54,10 +65,16 @@ MS²Rescore is available as a [desktop application][desktop], a [command line to
**MS²Rescore for immunopeptidomics:**

> **MS2Rescore: Data-driven rescoring dramatically boosts immunopeptide identification rates.**
> **MS²Rescore: Data-driven rescoring dramatically boosts immunopeptide identification rates.**
> Arthur Declercq, Robbin Bouwmeester, Aurélie Hirschler, Christine Carapito, Sven Degroeve, Lennart Martens, and Ralf Gabriels.
> _Molecular & Cellular Proteomics_ (2021) [doi:10.1016/j.mcpro.2022.100266](https://doi.org/10.1016/j.mcpro.2022.100266) <span class="__dimensions_badge_embed__" data-doi="10.1016/j.mcpro.2022.100266" data-hide-zero-citations="true" data-style="small_rectangle"></span>
**MS²Rescore for timsTOF DDA-PASEF data:**

> **TIMS²Rescore: A DDA-PASEF optimized data-driven rescoring pipeline based on MS²Rescore.**
> Arthur Declercq*, Robbe Devreese*, Jonas Scheid, Caroline Jachmann, Tim Van Den Bossche, Annica Preikschat, David Gomez-Zepeda, Jeewan Babu Rijal, Aurélie Hirschler, Jonathan R Krieger, Tharan Srikumar, George Rosenberger, Dennis Trede, Christine Carapito, Stefan Tenzer, Juliane S Walz, Sven Degroeve, Robbin Bouwmeester, Lennart Martens, and Ralf Gabriels.
> _bioRxiv_ (2024) [doi:10.1101/2024.05.29.596400](https://doi.org/10.1101/2024.05.29.596400) <span class="__dimensions_badge_embed__" data-doi="10.1101/2024.05.29.596400" data-hide-zero-citations="true" data-style="small_rectangle"></span>
**Original publication describing the concept of rescoring with predicted spectra:**

> **Accurate peptide fragmentation predictions allow data driven approaches to replace and improve upon proteomics search engine scoring functions.**
13 changes: 13 additions & 0 deletions docs/source/userguide/configuration.rst
@@ -81,6 +81,19 @@ preferably provide the formula instead of a mass shift, as the mass shift can al
be calculated from the formula, but not vice-versa, and some feature generators (such as DeepLC)
require the modification formula.

.. role:: raw-html(raw)
:format: html

Formula modification labels can be defined with the ``Formula:`` prefix, followed by each atom
symbol and its count, denoting which atoms are added or removed by the modification. If no count is
provided, it is assumed to be 1. For example, ``Formula:HO3P`` is equivalent to ``Formula:H1O3P1``.
For isotopes, prefix the atom symbol with the isotope number and place the entire block (isotope
number, atom symbol, and number of atoms) in square brackets. For example, the SILAC 13C(2) 15N(1)
label (`UNIMOD:2088 <https://unimod.org/modifications_view.php?editid1=2088>`_)
would be notated as ``Formula:C-2[13C2]N-1[15N]``, meaning that two C atoms are removed, two
:raw-html:`<sup>13</sup>C` atoms are added, one N atom is removed and one
:raw-html:`<sup>15</sup>N` atom is added.
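
The notation described above can be illustrated with a small parser. This is a minimal sketch written only from the rules in this paragraph, not the parser that MS²Rescore or psm_utils actually uses. The function name ``parse_formula_label`` and the returned dictionary layout are assumptions made for illustration.

```python
import re

# One token is either an isotope block such as "[13C2]" or a plain
# element such as "C-2". A missing count means 1.
_TOKEN = re.compile(
    r"\[(?P<iso>\d+)(?P<iso_el>[A-Z][a-z]?)(?P<iso_n>-?\d+)?\]"
    r"|(?P<el>[A-Z][a-z]?)(?P<n>-?\d+)?"
)


def parse_formula_label(label):
    """Return a {symbol: count} dict for a 'Formula:' label (hypothetical helper)."""
    prefix = "Formula:"
    if not label.startswith(prefix):
        raise ValueError(f"Label does not start with {prefix!r}: {label!r}")
    body = label[len(prefix):]

    counts = {}
    pos = 0
    while pos < len(body):
        match = _TOKEN.match(body, pos)
        if match is None:
            raise ValueError(f"Cannot parse {body[pos:]!r} in {label!r}")
        if match.group("iso"):
            # Isotopes are keyed as isotope number + element symbol, e.g. "13C"
            symbol = match.group("iso") + match.group("iso_el")
            count = match.group("iso_n")
        else:
            symbol = match.group("el")
            count = match.group("n")
        counts[symbol] = counts.get(symbol, 0) + (int(count) if count else 1)
        pos = match.end()
    return counts


silac = parse_formula_label("Formula:C-2[13C2]N-1[15N]")
phospho = parse_formula_label("Formula:HO3P")
```

Under these assumptions, ``silac`` maps ``C`` to -2, ``13C`` to 2, ``N`` to -1 and ``15N`` to 1, and ``Formula:HO3P`` and ``Formula:H1O3P1`` parse to the same result.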

An example of the :py:obj:`modification_mapping` could be:

.. tab:: JSON
69 changes: 42 additions & 27 deletions docs/source/userguide/tims2Rescore.rst
@@ -1,61 +1,76 @@
.. _timsrescore:
.. _tims2rescore:

TIMS²Rescore User Guide
=======================
TIMS²Rescore
============

Introduction
------------

The `TIMS²Rescore` tool is a DDA-PASEF adapted version of `ms2rescore` that allows users to perform rescoring of peptide-spectrum matches (PSMs) acquired on Bruker instruments. This guide provides an overview of how to use `timsrescore` in `ms2rescore` effectively.
`TIMS²Rescore` is a specialized version of `MS²Rescore` for timsTOF DDA-PASEF data. This guide
provides an overview of how to use TIMS²Rescore effectively.

Installation
------------

Before using `timsrescore`, ensure that you have `ms2rescore` installed on your system. You can install `ms2rescore` using the following command:

.. code-block:: bash
Installing TIMS²Rescore
-----------------------

pip install ms2rescore
TIMS²Rescore is part of the ``ms2rescore`` package. Check out the :ref:`installation` instructions
to get started.

Usage
-----

To use `timsrescore`, follow these steps:
To use TIMS²Rescore, follow these steps:

1. Prepare your input files:
- Ensure that you have the necessary input files, including the PSM file spectrum files
- Make sure that the PSM file format from a supported search engine or a standard format like .mzid(:external+psm_utils:ref:`supported file formats <supported file formats>`).
- Spectrum files can directly be given as .d or minitdf files from Bruker instruments or first converted to .mzML format.

2. Run `timsrescore`:
- To boost DDA-PASEF peptide identifications, TIMS²Rescore requires the spectrum files from
the timsTOF instrument and the PSM files with identifications from a supported search engine.
- Make sure that the PSM file format comes from a supported search engine or is a standard
format such as mzIdentML (See
:external+psm_utils:ref:`supported file formats <supported file formats>`).
- Spectrum files can directly be passed as ``.d`` or `miniTDF` raw data or can optionally be
first converted to mzML or MGF. We recommend using the format that was passed to the search
engine.

2. Run ``tims2rescore``:
- Open a terminal or command prompt.
- Navigate to the directory where your input files are located.
- Execute the following command:

.. code-block:: bash
timsrescore -p <path_to_psm_file> -s <path_to_spectrum_file> -o <path_to_output_file>
tims2rescore -p <path_to_psm_file> -s <path_to_spectrum_file>
Replace `<path_to_psm_file>` and `<path_to_spectrum_file>` with the actual paths to your
input files.

Replace `<path_to_psm_file>`, `<path_to_tims_file>`, and `<path_to_output_file>` with the actual paths to your input and output files.
_NOTE_ By default timsTOF specific models will be used for predictions. Optionally you can further configure settings through a configuration file. For more information on configuring `timsrescore`, refer to the :doc:`configuration` tab in the user guide.
.. admonition:: note

By default, specialized timsTOF models will be used for predictions. Optionally you can
further configure TIMS²Rescore through a configuration file. For more information, refer
to the :ref:`configuration` tab in the user guide.

3. Review the results:
- Once the `timsrescore` process completes, you will find the rescoring results in the specified output file or if not specified in the same directory as the input files
- If you want a detailed overview of the performance, you can either give the set `write_report` to `True` in the configuration file, use the `--write_report` option in the command line or run the following command:

- Once the ``tims2rescore`` process completes, you will find the rescoring results in the
same directory as the input files.
- If you want a detailed report of the rescoring performance, you can either set
`write_report` to `True` in the configuration file or use the `--write_report` option in the
``tims2rescore`` command line. Alternatively, run the following command after rescoring:

.. code-block:: bash
ms2rescore-report <output_prefix>
Replace `<output_prefix>` with the actual output prefix of the result files to the output file.
Replace `<output_prefix>` with the output prefix of the result files. For instance, if the
output file is ``identifications.psms.tsv``, then the output prefix is ``identifications``.
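
The relation between the result file and the output prefix can be sketched in a few lines of Python. This is an illustration only: the helper name ``output_prefix`` is hypothetical, and the ``.psms.tsv`` suffix is taken from the example in this guide rather than from the package source.

```python
from pathlib import Path


def output_prefix(psms_file, suffix=".psms.tsv"):
    """Strip the result-file suffix to obtain the output prefix (hypothetical helper)."""
    path = Path(psms_file)
    if not path.name.endswith(suffix):
        raise ValueError(f"{psms_file!r} does not end with {suffix!r}")
    # Keep the parent directory and drop only the trailing suffix
    return str(path.with_name(path.name[: -len(suffix)]))


prefix = output_prefix("identifications.psms.tsv")

# Argument list for the report command shown above, e.g. to pass to subprocess.run
report_command = ["ms2rescore-report", prefix]
```

Here ``prefix`` is ``identifications``, matching the example above.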

Additional Options
Additional options
------------------

`ms2rescore` provides additional options to customize the `timsrescore` process. You can explore these options by running the following command:
`tims2rescore` provides additional options to customize rescoring. You can explore these options
by running the following command:

.. code-block:: bash
timsrescore --help
tims2rescore --help
2 changes: 1 addition & 1 deletion ms2rescore/__init__.py
@@ -1,6 +1,6 @@
"""MS²Rescore: Sensitive PSM rescoring with predicted MS² peak intensities and RTs."""

__version__ = "3.1.0-dev9"
__version__ = "3.1.0"

from warnings import filterwarnings

18 changes: 12 additions & 6 deletions ms2rescore/__main__.py
@@ -41,18 +41,24 @@ def _print_credits(tims=False):
text = Text()
text.append("\n")
if tims:
text.append("TIMS²Rescore", style="bold link https://github.com/compomics/ms2rescore")
text.append("TIMS²Rescore", style="bold link https://github.com/compomics/tims2rescore")
else:
text.append("MS²Rescore", style="bold link https://github.com/compomics/ms2rescore")
text.append(f" (v{__version__})\n", style="bold")
if tims:
text.append("MS²Rescore tuned for Bruker timsTOF instruments.\n", style="italic")
text.append("MS²Rescore tuned for timsTOF DDA-PASEF data.\n", style="italic")
text.append("Developed at CompOmics, VIB / Ghent University, Belgium.\n")
text.append("Please cite: ")
text.append(
"Buur & Declercq et al. JPR (2024)",
style="link https://doi.org/10.1021/acs.jproteome.3c00785",
)
if tims:
text.append(
"Declercq & Devreese et al. bioRxiv (2024)",
style="link https://doi.org/10.1101/2024.05.29.596400",
)
else:
text.append(
"Buur & Declercq et al. JPR (2024)",
style="link https://doi.org/10.1021/acs.jproteome.3c00785",
)
text.append("\n")
if tims:
text.stylize("#006cb5")
8 changes: 4 additions & 4 deletions ms2rescore/parse_psms.py
@@ -47,7 +47,9 @@ def parse_psms(config: Dict, psm_list: Union[PSMList, None]) -> PSMList:
_calculate_qvalues(psm_list, config["lower_score_is_better"])
if config["psm_id_rt_pattern"] or config["psm_id_im_pattern"]:
logger.debug("Parsing retention time and/or ion mobility from PSM identifier...")
_parse_values_from_spectrum_id(config, psm_list)
_parse_values_from_spectrum_id(
psm_list, config["psm_id_rt_pattern"], config["psm_id_im_pattern"]
)

# Store scoring values for comparison later
for psm in psm_list:
@@ -165,9 +167,7 @@ def _parse_values_from_spectrum_id(
["retention_time", "ion_mobility"],
):
if pattern:
logger.debug(
f"Parsing {label} from spectrum_id with regex pattern " f"{psm_id_rt_pattern}"
)
logger.debug(f"Parsing {label} from spectrum_id with regex pattern " f"{pattern}")
try:
pattern = re.compile(pattern)
psm_list[key] = [
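
The hunks above pass the retention time and ion mobility regex patterns to the helper explicitly and log the pattern that is actually being applied. The underlying technique, pulling a numeric value out of each spectrum identifier with one capture group, can be sketched as follows. This is a standalone illustration, not the code in `parse_psms.py`: the function name, the identifier layout, and the two patterns are all made up for the example.

```python
import re


def parse_values_from_ids(spectrum_ids, pattern):
    """Extract one float per identifier using the first capture group of `pattern`."""
    compiled = re.compile(pattern)
    values = []
    for spectrum_id in spectrum_ids:
        match = compiled.search(spectrum_id)
        if match is None:
            raise ValueError(
                f"Pattern {pattern!r} did not match spectrum ID {spectrum_id!r}"
            )
        values.append(float(match.group(1)))
    return values


# Hypothetical identifiers; real ones depend on the search engine output
spectrum_ids = ["scan=101;rt=12.5;im=0.85", "scan=102;rt=13.25;im=0.91"]

retention_times = parse_values_from_ids(spectrum_ids, r"rt=([\d.]+)")
ion_mobilities = parse_values_from_ids(spectrum_ids, r"im=([\d.]+)")
```

Passing the pattern as an argument, as the commit does, keeps the helper independent of the configuration dictionary and makes it straightforward to test with a single pattern at a time.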
30 changes: 29 additions & 1 deletion ms2rescore/report/generate.py
@@ -351,7 +351,7 @@ def _get_features_context(
observed_column="ccs_observed_im2deep",
xaxis_label="Observed CCS",
yaxis_label="Predicted CCS",
plot_title="Predicted vs. observed CCS",
plot_title="Predicted vs. observed CCS - IM2Deep",
)

context["charts"].append(
@@ -361,6 +361,34 @@ def _get_features_context(
"chart": scatter_chart.to_html(**PLOTLY_HTML_KWARGS),
}
)

# ionmob specific charts
if "ionmob" in feature_names:
try:
import deeplc.plot

scatter_chart = deeplc.plot.scatter(
df=features[(~psm_list["is_decoy"]) & (psm_list["qvalue"] <= 0.01)],
predicted_column="ccs_predicted",
observed_column="ccs_observed",
xaxis_label="Observed CCS",
yaxis_label="Predicted CCS",
plot_title="Predicted vs. observed CCS - ionmob",
)

context["charts"].append(
{
"title": TEXTS["charts"]["ionmob_performance"]["title"],
"description": TEXTS["charts"]["ionmob_performance"]["description"],
"chart": scatter_chart.to_html(**PLOTLY_HTML_KWARGS),
}
)

# TODO: for now, ionmob plot will only be available if deeplc is installed. Since ionmob does not have a dependency on deeplc, this should be changed in the future.
except ImportError:
logger.warning(
"Could not import deeplc.plot, skipping ionmob CCS prediction performance plot. Please install DeepLC to generate this plot."
)
return context


6 changes: 6 additions & 0 deletions ms2rescore/report/templates/texts.toml
@@ -111,3 +111,9 @@ title = "IM2Deep model performance"
description = """
IM2Deep model performance can be visualized by plotting the predicted CCS against the observed CCS.
"""

[charts.ionmob_performance]
title = "ionmob model performance"
description = """
ionmob model performance can be visualized by plotting the predicted CCS against the observed CCS.
"""
4 changes: 2 additions & 2 deletions pyproject.toml
@@ -41,8 +41,8 @@ dependencies = [
"jinja2>=3",
"lxml>=4.5",
"mokapot>=0.9",
"ms2pip>=4.0.0-dev10",
"ms2rescore_rs",
"ms2pip>=4.0.0",
"ms2rescore_rs>=0.3.0",
"numpy>=1.16.0",
"pandas>=1.0",
"plotly>=5",