toffy

The toffy repo is designed to simplify the process of generating and processing data on the MIBIScope platform.

This repo is currently in beta testing. None of the code has been published yet, and we will be making breaking changes frequently. If you find bugs, please open an issue. If you have questions or want to collaborate, please reach out to Noah ([email protected]).


Overview

The repo has five main parts, with associated code and Jupyter notebooks for each. We have also recorded workshop talks which complement the repository: MIBI Workshop (Pre-Recorded Lectures) Playlist.

1. Using toffy for the first time

The first time you use toffy on one of the commercial instruments, you'll need to perform some basic tasks to ensure everything is working properly. The setup Jupyter notebook will guide you through this process, and the resulting directory structure is explained below (directory structure). For more information, see the setup toffy walkthrough.

2. Setting up a MIBI run

For large MIBI runs, it is often convenient to automatically generate the JSON file containing the individual FOVs. There are two notebooks for this task: one for large tiled regions and one for TMAs. If you will be tiling multiple adjacent FOVs together into a single image, the tiling notebook can automate this process. You provide the location of the top corner of the tiled region, along with the number of FOVs along the rows and columns, and it will automatically create the appropriate JSON file.
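
The notebook produces the real run file, but the underlying grid layout is simple to picture. Below is a minimal sketch of that idea; the field names (name, centerPointMicrons), the naming scheme, and the 400-micron step size are illustrative assumptions, not the actual run-file schema.

```python
import json

# Illustrative sketch only -- the tiling notebook builds the real run JSON.
# Field names and the FOV step size are assumptions, not toffy's schema.
def make_fov_grid(top_corner_x, top_corner_y, num_rows, num_cols, fov_size=400):
    """Lay out FOV centers for a rectangular tiled region, row by row."""
    fovs = []
    for row in range(num_rows):
        for col in range(num_cols):
            fovs.append({
                "name": f"R{row + 1}C{col + 1}",         # hypothetical naming scheme
                "centerPointMicrons": {
                    "x": top_corner_x + col * fov_size,  # step across the columns
                    "y": top_corner_y - row * fov_size,  # step down the rows
                },
            })
    return {"fovs": fovs}

# Example: a 3 x 4 tiled region whose top corner sits at (10000, 30000)
print(json.dumps(make_fov_grid(10000, 30000, num_rows=3, num_cols=4), indent=2))
```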

The second notebook is for TMAs and is run after you have selected the appropriate cores from the TMA. It will generate an overlay of the TMA image and the locations you picked so you can confirm you selected the correct cores. It will then check that the cores are named correctly and that there are no duplicates.
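
The duplicate check itself boils down to comparing the list of core names against its unique set. A minimal sketch of that idea, with made-up core names rather than output from the notebook:

```python
from collections import Counter

# Hypothetical core names -- in practice these come from the selected TMA cores.
fov_names = ["R1C1", "R1C2", "R2C1", "R1C2"]

duplicates = [name for name, count in Counter(fov_names).items() if count > 1]
if duplicates:
    print(f"Duplicate core names found: {duplicates}")   # -> ['R1C2']
```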

For more information, see the MIBI tiling walkthrough.

3. Evaluating a MIBI run

There are a number of different computational tasks to complete once a MIBI run has finished to ensure everything went smoothly.

  • 3a: real-time monitoring. The MIBI monitoring notebook will monitor an ongoing MIBI run and begin processing the image data as soon as it is generated. This notebook is continually being updated as we move more of our processing pipeline to run in real time as the data is generated. For more information, see the real-time monitoring walkthrough; a rough sketch of the watching idea follows this list.
  • 3b - 3e: post-run monitoring. For each step in the monitoring notebook, we have a dedicated notebook that can perform the same tasks once a run is complete. For more information, see the image processing and extraction walkthrough.
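
The monitoring notebook handles the watching for you, but as a rough mental model (not toffy's actual implementation), real-time monitoring amounts to polling the run directory and processing each FOV file as it appears. The run path and the .bin extension below are assumptions for illustration.

```python
import time
from pathlib import Path

# Rough sketch of the watching idea only; the monitoring notebook's watcher is
# more robust (for example, it also waits for files to finish writing).
run_dir = Path(r"D:\Data\example_run")   # hypothetical run folder
processed = set()

while True:
    for fov_file in sorted(run_dir.glob("*.bin")):   # assumed raw-file extension
        if fov_file.name not in processed:
            print(f"New FOV detected: {fov_file.name} -- processing...")
            # extraction / QC steps would run here
            processed.add(fov_file.name)
    time.sleep(30)   # poll every 30 seconds
```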

4. Processing MIBI data

Once your run has finished, you can begin to process the data to make it ready for analysis. To remove background signal contamination, as well as compensate for channel crosstalk, you can use the compensation notebook. This will guide you through the Rosetta algorithm, which uses a flow-cytometry style compensation approach to remove spurious signal.
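
Rosetta's actual matrices and correction factors are supplied through the notebook, but the flow-cytometry-style idea is linear unmixing: observed counts are modeled as the true signal multiplied by a spillover matrix, so multiplying by the inverse of that matrix recovers the compensated signal. A generic two-channel sketch with made-up numbers, not Rosetta's real coefficients:

```python
import numpy as np

# Generic flow-cytometry-style compensation sketch; Rosetta's real matrices and
# per-channel handling live in the compensation notebook, not here.
spillover = np.array([
    [1.00, 0.05],   # 5% of channel 0's true signal bleeds into channel 1 (made up)
    [0.02, 1.00],   # 2% of channel 1's true signal bleeds into channel 0 (made up)
])

observed = np.array([1000.0, 250.0])                 # observed counts per channel
compensated = observed @ np.linalg.inv(spillover)    # unmix by inverting the spillover

print(np.round(compensated, 1))   # spurious cross-channel signal removed
```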

Following compensation, you will want to normalize your images to ensure consistent intensity across the run. You can use the normalization notebook to perform this step.

5. Formatting MIBI runs for analysis

After the image processing and cleanup from toffy is complete, the final step is to format your data to facilitate easy downstream analysis. The reorganization notebook will walk you through the process of renaming FOVs, combining partial runs, and consolidating your images. For more information, see the reorganizing your data walkthrough.

Pipeline Flowchart

flow-chart

Installation

In order to get toffy working, you'll need to first install some dependencies and the repository itself. For more information, see the toffy setup walkthrough.

Requirements for specific operating systems

The process of setting up is largely the same for different operating systems. However, there are a few key differences.

Windows

  • You must have C++ Build Tools (VS19) installed. Go to https://visualstudio.microsoft.com/visual-cpp-build-tools/ and click 'Download Build Tools'. Open the installer and make sure you are installing the package labeled C++ build tools, then follow the prompts.

    • (If installing on CAC, you will need the admin password and must contact [email protected])
    • Git - CAC: We highly recommend installing git system-wide on the CAC by downloading the installation utility here.
      1. Under Standalone Installer, click the 64-bit Git for Windows Setup link to download the proper installer.
      2. Run the Git setup .exe file. It should be version 2.37.1 or higher.
      3. Click Yes to allow Git to make the necessary changes.
      4. Click Next to accept the GNU License.
      5. Click Next to save Git in its default location.
      6. Next, the installer will give you a list of options for each menu. Leave everything at its default; we recommend not changing anything unless you are confident in what you are doing.
      7. The last menu will ask if you would like to enable any experimental options. Leave them unchecked and click Install. Git will now be installed.
      8. Open the Windows Terminal, and within the PowerShell tab type git and hit Enter. If Git prints its usage information, you're good to go!
  • You will need the latest version of Anaconda (Miniconda preferred). Download it here: https://docs.conda.io/en/latest/miniconda.html and select the appropriate installer for your system. Choose the "Just Me" option for installation; you do not need to select the "Tutorial" or "Getting Started" options. Continue with the installation.

macOS

  • You will need the latest version of Anaconda (Miniconda preferred). Download it here: https://docs.conda.io/en/latest/miniconda.html and select the appropriate installer for your system. Choose the "Just Me" option for installation; you do not need to select the "Tutorial" or "Getting Started" options. Continue with the installation.

Setting up the virtual environment

  • For Windows, you will need to open the Anaconda Powershell Prompt instead of the regular PowerShell prompt for the following steps.

  • If you are a macOS user, open Terminal.

If you do not already have git installed, run

conda install git

Navigate to the desired location (e.g., Documents) and clone the repo.

cd .\Documents\
git clone https://github.com/angelolab/toffy.git

Move into the directory and build the environment:

cd toffy
conda env create -f environment.yml

This creates a Python 3.8 environment named toffy_env. You can view everything that gets installed by looking at the environment.yml file.

Using the repo

Once you're ready to use the repo, enter the following commands.

First, activate the environment:

conda activate toffy_env

Once activated, notebooks can be used via this command for Windows:

start_jupyter.sh

or this command for macOS:

./start_jupyter.sh

You can leave the Jupyter notebook running once you're done. If it ever gets closed or you need to reopen it, just follow the steps above.

Updating the repo

The toffy repo is constantly being updated. In order to get those changes to your version, you'll need to tell git to update with the following command:

git pull

After performing the above command, you will sometimes need to update your environment:

conda remove --name toffy_env --all
conda env create -f environment.yml

To update the notebooks, run this command for Windows:

start_jupyter.sh -u

or this command for macOS:

./start_jupyter.sh -u

Directory structure

Data from each run on the MIBI will be stored in the default base directory D:\Data, in a subdirectory labeled with the run name. The setup Jupyter notebook creates the following folders, which will be used throughout toffy.

Four new folders are created on the D drive:

Directories in D drive

D directories

Within C:\Users\Customer.ION\Documents are directories that store necessary files used to set up and monitor a MIBI run.

  • normalization_curve: directory which stores the normalization curve file for the machine that was produced by the set up notebook and necessary for notebook 4b
  • tiled_image_jsons: stores all files used to set up a tiled run in the tiling notebook
  • autolabeled_tma_jsons: stores all files used to set up a TMA run in the TMA notebook
  • panel_files: directory containing the run panel file, needed for notebooks 3a, 3b, 4a, and 4b.
  • watcher_logs: contains the log file of FOVs which have been processed in the monitoring notebook
  • run_metrics: contains the data files produced by the QC and MPH notebooks
  • rosetta_matrices: directory containing the finalized compensation matrix generated in the compensation notebook
  • rosetta_testing: directory which stores the input files for and the output of the Rosetta testing completed in the compensation notebook
Directories in C drive

C directories



You can see below how to pin a folder to Quick Access, which can then be easily located in the section of the same name on the left side of File Explorer.

We suggest pinning the following folders: tiled_image_jsons, autolabeled_tma_jsons, run_metrics.

Quick Access

Panel format

Many of the scripts in toffy require a panel file, identifying which targets have been put on which masses. You can download your panel online from the Ionpath MibiTracker under the resources tab. In the panels section, open your panel and click Download csv.

panel download

You should then copy the file to the C:\Users\Customer.ION\Documents\panel_files directory and rename the file to be descriptive of your run. The toffy notebooks expect the panel files to be formatted slightly differently than the Ionpath default. The first time your panel is read into one of the notebooks, it will be automatically modified by our scripts to contain the necessary information for toffy processing. This includes adding additional channels which are used for compensation, a full list of which can be found in the example panel file.
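
The exact column names depend on the MibiTracker export, so treat the names below as placeholders; the point is simply that the panel maps masses to targets and can be sanity-checked with pandas before you hand it to the notebooks. The file name is hypothetical.

```python
import pandas as pd

# Quick sanity check on a downloaded panel CSV before using it in the notebooks.
# Column names ("Mass", "Target") are placeholders -- compare against your own
# export and the example panel file in the repo for the expected layout.
panel = pd.read_csv(r"C:\Users\Customer.ION\Documents\panel_files\my_run_panel.csv")

print(panel.head())
print(f"{len(panel)} rows in panel")

# Duplicate masses usually mean something went wrong in the export
if "Mass" in panel.columns and panel["Mass"].duplicated().any():
    print("Warning: duplicate masses found in panel")
```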

Median Pulse Height

The median pulse height (MPH) provides a way to assess the sensitivity of the detector, independent of the specific sample being acquired. It uses characteristics of the output from the detector itself to determine what fraction of maximum sensitivity the instrument is currently running at. We use this fraction of the maximum sensitivity to determine 1) when the detector needs to be swept again and 2) how much to normalize our images by after the fact to correct for this change in sensitivity. The minimum MPH required to still have acceptable signal will depend in part on the markers in your panel, the settings of the instrument, and other factors. However, we often find that the minimum is somewhere between 5,000 and 6,000 MPH.
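
As a toy illustration of the idea (the normalization notebook uses the instrument-specific curve produced by the setup notebook, and the MPH-to-sensitivity relationship is not necessarily the simple linear one assumed here):

```python
# Toy MPH-based normalization sketch with made-up numbers; the real workflow
# uses the machine-specific normalization curve, not a linear assumption.
reference_mph = 8000.0   # made-up MPH at full detector sensitivity
observed_mph = 6000.0    # made-up MPH measured partway through a run

sensitivity_fraction = observed_mph / reference_mph   # 0.75 -> running at 75% sensitivity
normalization_factor = 1.0 / sensitivity_fraction     # scale counts up by ~1.33x

print(f"Sensitivity: {sensitivity_fraction:.0%}; multiply counts by {normalization_factor:.2f}")
```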

Questions?

If you have a general question or are having trouble with part of the repo, head to the discussions tab to get help. If you've found a bug with the codebase, first make sure there's not already an open issue, and if not, you can then open an issue describing the bug.

Before opening an issue, please double-check that someone else hasn't already opened one for your question.