Home
Welcome to the scMultipleX Wiki!
scMultipleX is a software package for feature extraction of microscopy imaging data. It provides workflows for feature extraction of segmented objects (e.g. organoids) and single cells, and for linking of objects and cells over multiplexing rounds. It supports 2D and 3D imaging data, and single-round or multiplexed experiments. scMultipleX uses Prefect (v1.4) for parallelized processing, and assumes input data pre-processed with Drogon.
The workflow consists of the following tasks:
- Task 0 Build Experiment: Initialize output data storage structure with FAIM-HCS (v0.1.1)
- Task 1 Feature Extraction: Perform 2D object-level and 3D single-cell-level feature extraction and nuclear-to-membrane linking
- Task 2 Organoid Multiplex: Link objects across multiplexing rounds
- Task 3 Nuclear Multiplex: Link nuclei within objects across multiplexing rounds
- Task 4 Aggregate Features: Output measured features for each round and object type (e.g. organoids, nuclei, membranes)
- Task 5 Combine Nuclear and Membrane Features: Output combined nuclear and membrane features based on nuclear-to-membrane linking
- Task 6 Aggregate Organoid Multiplex: Output measured object features across multiplexing rounds
- Task 7 Aggregate Nuclear Multiplex: Output measured nuclear features across multiplexing rounds
Create a new conda environment with Python 3.9:
conda create -n scmpx python=3.9
Activate the conda environment:
conda activate scmpx
Install scMultipleX:
pip install 'scmultiplex @ git+https://github.com/fmi-basel/gliberal-scMultipleX.git#scmultiplex[plotting]'
- SSH or Remote Desktop to a virtual machine.
Note to Windows users: use PowerShell or PuTTY. Don't forget to start a tmux or GNU screen session before starting a long analysis.
- Create an output directory (e.g. username/scMultiplex-demo-test)
mkdir /Users/MY_SAVE_DIR
- Copy the demo config file (demo.ini) to your own user folder
cp -t /Users/MY_SAVE_DIR /Code/Common/Repositories/gliberal-scMultipleX/resources/scMultipleX_testdata/demo.ini
- Check that demo.ini is copied over:
ls /Users/MY_SAVE_DIR
You should see the demo.ini file listed in the directory.
- Edit this config file:
- You can edit the file via your local mapped drive or via Remote Desktop to a workstation
- Navigate to demo.ini and open it in your favorite text editor (e.g. Notepad++)
- Change base_dir_save to the MY_SAVE_DIR path and save the file.
- Back on the virtual machine, create a symbolic link to scMultipleX in the bin folder of your home directory. The link is persistent and only needs to be created the first time you run scMultipleX on a given machine:
cd $HOME
mkdir -p bin
ln -s -t bin /Code/Common/Repositories/gliberal-scMultipleX/run_scmultiplex
ls -l bin
- Run scMultipleX on the test dataset
run_scmultiplex --help
Let's run each task one by one:
run_scmultiplex --cpus 10 --config /Users/MY_SAVE_DIR/demo.ini --tasks 0
run_scmultiplex --cpus 10 --config /Users/MY_SAVE_DIR/demo.ini --tasks 1
etc...
Or we can run multiple tasks at once:
run_scmultiplex --cpus 10 --config /Users/MY_SAVE_DIR/demo.ini --tasks 0 1 2 3 4 5 6 7
- Check the output folder!
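Running the tasks one by one, as in the demo above, can also be scripted. The following is a minimal sketch, not part of scMultipleX: it only assembles the `run_scmultiplex` call from the `--cpus`, `--config` and `--tasks` flags shown above and stops at the first failing task. The helper names `build_command` and `run_tasks_one_by_one` are hypothetical, and the sketch assumes `run_scmultiplex` is on your PATH.

```python
import shlex
import subprocess
from typing import Iterable, List, Sequence


def build_command(config: str, tasks: Sequence[int], cpus: int = 10) -> List[str]:
    """Assemble a run_scmultiplex call using only the flags shown in the demo."""
    if not tasks:
        raise ValueError("at least one task is required")
    bad = [t for t in tasks if t not in range(8)]
    if bad:
        raise ValueError(f"tasks must be integers 0 to 7, got {bad}")
    return [
        "run_scmultiplex",
        "--cpus", str(cpus),
        "--config", config,
        "--tasks", *[str(t) for t in tasks],
    ]


def run_tasks_one_by_one(config: str, tasks: Iterable[int], cpus: int = 10,
                         dry_run: bool = True) -> None:
    """Run each task in its own call; check=True aborts on the first failure."""
    for task in tasks:
        cmd = build_command(config, [task], cpus)
        print(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # dry_run=True only prints the commands; set it to False to execute them.
    run_tasks_one_by_one("/Users/MY_SAVE_DIR/demo.ini", range(8))
```

Keeping the dry run as the default makes it easy to inspect the exact commands before committing to a long analysis.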
General parameters for initializing FAIM-HCS experiment structure
well_pattern = Regex pattern for recognizing well ID
raw_ch_pattern = Regex pattern for recognizing channel ID in raw image files
mask_ending = Suffix of organoid segmentation image
base_dir_raw = Path to raw data directory (folder contains rounds)
base_dir_save = Path to save directory
spacing = Z,Y,X pixel spacing of region-extracted data in um/pix, comma-separated
overview_spacing = Y,X pixel spacing of well overview images, comma-separated
round_names = Names of multiplexing rounds, comma-separated
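Before launching a run, it can help to test well_pattern and raw_ch_pattern against one of your own file names. The sketch below uses Python's `re` module for that. The two patterns and the file name are hypothetical examples, not values taken from demo.ini, and whether scMultipleX expects a capture group in these patterns is not stated here; the sketch reads group 1 purely for illustration.

```python
import re
from typing import Optional

# Hypothetical example values; adapt them to your own file naming scheme.
WELL_PATTERN = r"_([A-P][0-9]{2})_"
RAW_CH_PATTERN = r"_(C[0-9]{2})\."
FILENAME = "exp_plate1_B03_C01.tif"


def extract_id(pattern: str, filename: str) -> Optional[str]:
    """Return the first capture group of the first match, or None if no match."""
    match = re.search(pattern, filename)
    return match.group(1) if match else None


well_id = extract_id(WELL_PATTERN, FILENAME)
channel_id = extract_id(RAW_CH_PATTERN, FILENAME)
print(f"well={well_id} channel={channel_id}")
```

A result of `None` for either ID is an early sign that the pattern does not fit the file names in base_dir_raw.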
Round-specific parameters for initializing FAIM-HCS experiment structure. Include one such subsection for each round and update its name, e.g. round_R1
name = Round name
nuc_ending = Suffix of nuclear segmentation image
mem_ending = Suffix of membrane segmentation image
root_dir = Path to raw data directory for this round
fname_barcode_index = Number of underscores in Yokogawa barcode, integer
organoid_seg_channel = Image channel used for organoid segmentation, e.g. C01
nuclear_seg_channel = Image channel used for nuclear segmentation, e.g. C01
membrane_seg_channel = Image channel used for membrane segmentation, e.g. C04
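The comma-separated values and per-round subsections described above can be sanity-checked with Python's standard `configparser`. This is a minimal sketch under stated assumptions: the section name `general`, the plain `round_<name>` section naming, and all values are hypothetical placeholders, so compare them with the layout of your own copy of demo.ini.

```python
import configparser
from typing import Callable, List

# Hypothetical INI content; section names and values are placeholders.
EXAMPLE_INI = """
[general]
base_dir_save = /Users/MY_SAVE_DIR
spacing = 0.6,0.216,0.216
overview_spacing = 0.216,0.216
round_names = R0,R1

[round_R0]
name = R0
nuclear_seg_channel = C01

[round_R1]
name = R1
nuclear_seg_channel = C01
"""


def split_csv(value: str, cast: Callable = str) -> List:
    """Split a comma-separated option into a list, ignoring blank items."""
    return [cast(item.strip()) for item in value.split(",") if item.strip()]


config = configparser.ConfigParser()
config.read_string(EXAMPLE_INI)

general = config["general"]
spacing = tuple(split_csv(general["spacing"], float))   # Z, Y, X
round_names = split_csv(general["round_names"])

# Every round listed in round_names should have its own subsection.
missing_rounds = [name for name in round_names if f"round_{name}" not in config]
print(spacing, round_names, missing_rounds)
```

To check a real file, replace `config.read_string(EXAMPLE_INI)` with `config.read(path_to_ini)` and adjust the section names.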
Parameters used during feature extraction
excluded_plates = Folder names of plates to exclude from analysis (e.g. day2,day3), comma-separated
excluded_wells = Well IDs to exclude from analysis (e.g. A01,C06), comma-separated
ovr_channel = Image channel used for organoid segmentation, e.g. C01
name_ovr = Naming of regionprops file; always keep as regionprops_ovr_
iop_cutoff = Cutoff threshold (float between 0 and 1) for calling a nucleus inside a membrane. Recommended value is 0.6
iop = (number of pixels in intersection of membrane and nuclear label) / (number of pixels in nuclear label)
Closer to 1 means better match
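As a worked example of the iop definition above, the sketch below counts pixels in two small binary masks given as nested lists. It illustrates the formula only and is not the scMultipleX implementation; the function name `compute_iop` and the toy masks are hypothetical, and comparing with `>=` against the cutoff is an assumption.

```python
from typing import Sequence

Mask = Sequence[Sequence[int]]


def compute_iop(nucleus: Mask, membrane: Mask) -> float:
    """iop = pixels in (nucleus AND membrane) / pixels in nucleus."""
    nuclear_pixels = 0
    shared_pixels = 0
    for nuc_row, mem_row in zip(nucleus, membrane):
        for nuc, mem in zip(nuc_row, mem_row):
            if nuc:
                nuclear_pixels += 1
                if mem:
                    shared_pixels += 1
    if nuclear_pixels == 0:
        raise ValueError("nuclear mask is empty")
    return shared_pixels / nuclear_pixels


# Toy masks: the nucleus has 4 pixels, 3 of which lie inside the membrane.
nucleus = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
membrane = [
    [0, 0, 1, 1],
    [0, 1, 1, 1],
    [0, 0, 0, 0],
]

iop = compute_iop(nucleus, membrane)
IOP_CUTOFF = 0.6  # recommended value from the parameter description
is_linked = iop >= IOP_CUTOFF  # assumption: values at the cutoff count as a match
print(iop, is_linked)
```

Because the denominator is the nuclear label alone, a small nucleus fully enclosed by a large membrane still scores 1.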
Parameters used during organoid linking
iou_cutoff = Cutoff threshold (float between 0 and 1) for matching an RX object to an R0 object. Recommended value is 0.2
iou = (number of pixels in intersection of R0 and RX object label) / (number of pixels in union of R0 and RX object label)
Closer to 1 means better match
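The iou definition above can be illustrated in the same way. This is a minimal sketch on toy binary masks, not the scMultipleX linking code; the function name `compute_iou` and the masks are hypothetical, and comparing with `>=` against the cutoff is an assumption.

```python
from typing import Sequence

Mask = Sequence[Sequence[int]]


def compute_iou(r0_object: Mask, rx_object: Mask) -> float:
    """iou = pixels in (R0 AND RX) / pixels in (R0 OR RX)."""
    intersection = 0
    union = 0
    for r0_row, rx_row in zip(r0_object, rx_object):
        for r0, rx in zip(r0_row, rx_row):
            if r0 and rx:
                intersection += 1
            if r0 or rx:
                union += 1
    if union == 0:
        raise ValueError("both masks are empty")
    return intersection / union


# Toy masks: 4 pixels in R0, 5 in RX, 3 shared, so the union has 6 pixels.
r0_object = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
]
rx_object = [
    [0, 0, 1, 1],
    [0, 1, 1, 1],
]

iou = compute_iou(r0_object, rx_object)
IOU_CUTOFF = 0.2  # recommended value from the parameter description
is_match = iou >= IOU_CUTOFF  # assumption: values at the cutoff count as a match
print(iou, is_match)
```

Unlike iop, iou is symmetric in its two inputs, so it also penalizes size differences between the R0 and RX objects.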
scMultipleX is installed at /Code/Common/Repositories/gliberal-scMultipleX and can be run on any Linux machine with this conda environment. See the Demo Run section for more details.
Use run_scmultiplex --help for details on arguments.
To run:
run_scmultiplex --cpus [NUM CORES, INT] --config [PATH TO .INI CONFIG] --tasks [TASKS TO RUN]
Note:
- --cpus defaults to the number of cores available to the process on the machine
- --tasks accepts integers 0 to 7
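The number of cores available to a process can be lower than the number of cores on the machine, for example under a cluster scheduler. How scMultipleX determines its default is not stated here; the sketch below shows one common way to query this value in Python, with the hypothetical helper name `available_cpus`.

```python
import os


def available_cpus() -> int:
    """Number of cores the current process may use (at least 1)."""
    try:
        # Linux only: respects CPU affinity masks set by schedulers or taskset.
        return len(os.sched_getaffinity(0))
    except AttributeError:
        # Platforms without sched_getaffinity: fall back to the total core count.
        return os.cpu_count() or 1


print(available_cpus())
```

Comparing this number with the value you pass to --cpus helps avoid requesting more workers than the machine can provide.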
To run on a CPU cluster, a submission script is needed. A basic example to start from is located at:
/Code/Common/Repositories/gliberal-scMultipleX/clusterme.sh
- Copy clusterme.sh to your own folder, for example MY_SAVE_DIR
cp -t /Users/MY_SAVE_DIR /Code/Common/Repositories/gliberal-scMultipleX/clusterme.sh
- Edit clusterme.sh with your favorite editor and change the config_path and --job-name options
- SSH to the cluster
- Navigate to the folder containing clusterme.sh using the cd command
cd /Users/MY_SAVE_DIR
- Run sbatch clusterme.sh
- x_pos_pix: x-centroid
- y_pos_pix: y-centroid
- z_pos_scaled: z-centroid scaled by the z-spacing anisotropy
- z_pos_img: z-centroid without scaling, matches spacing of raw and label image
- surface_area: surface area of object in pixels, calculated with the marching cubes algorithm
- volume_pix: volume of object, taking into account z-spacing anisotropy
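The relationship between the scaled and unscaled z features can be sketched with simple arithmetic. The snippet below assumes that the z-spacing anisotropy is the ratio of Z to X pixel spacing and that scaling means multiplying by this ratio; both are assumptions for illustration, not the documented scMultipleX calculation, and all numbers are hypothetical.

```python
# Hypothetical Z, Y, X pixel spacing in um/pix (the "spacing" parameter).
spacing_zyx = (1.0, 0.5, 0.5)

# Assumption: anisotropy is the ratio of Z spacing to X spacing.
anisotropy = spacing_zyx[0] / spacing_zyx[2]

# Hypothetical measurements in image coordinates.
z_pos_img = 10.0      # z-centroid in slices of the raw and label image
voxel_count = 1000    # number of voxels in the object label

# Assumption: scaling multiplies z-dependent quantities by the anisotropy,
# expressing them in units of XY pixels.
z_pos_scaled = z_pos_img * anisotropy
volume_pix = voxel_count * anisotropy

print(anisotropy, z_pos_scaled, volume_pix)
```

Under these assumptions, z_pos_scaled and volume_pix are expressed in isotropic XY-pixel units, while z_pos_img can be used directly to index slices of the raw image.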