- Metadata
- Images
- Extracting single-cell features using CellProfiler
- Processing the profiles
- How to run the analyses?
Link to the bioRxiv manuscript: not yet available.
This is a dataset of images and profiles generated as part of the JUMP Cell Painting (JUMP-CP) project. Genes were either overexpressed (ORF) or knocked out (CRISPR), and the cells were then imaged using an assay called Cell Painting. Features were extracted from the images using the CellProfiler software. The features were then processed, and the resulting profiles were analyzed using the notebooks in this repository.
In the following sections, instructions are provided for downloading the various components of this dataset, processing the dataset and analyzing the profiles.
Metadata, such as which plate in which batch contains a particular gene, is available in the datasets repo.
Cell images are available for download from the cellpainting gallery public AWS S3 bucket.
There are two sources of data: ORF images are from source_4 and CRISPR images are from source_13.
source=source_4  # ORF images; use source_13 for CRISPR images
aws s3 sync \
--no-sign-request \
s3://cellpainting-gallery/cpg0016-jump/${source}/images/ .
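Because the images come from two sources, it can help to generate the download command for each source rather than editing it by hand. The following is a minimal sketch, not part of this repository: the function names jump_images_uri and print_image_sync and the local images/<source>/ destination are assumptions made for illustration. The functions only print the command, so nothing is downloaded until you run the printed line yourself.

```shell
# Sketch only: jump_images_uri and print_image_sync are hypothetical helpers.

# Print the S3 prefix that holds the images of one source.
# Only the two sources named in this README are accepted.
jump_images_uri() {
  local source="$1"
  case "$source" in
    source_4|source_13)
      printf 's3://cellpainting-gallery/cpg0016-jump/%s/images/\n' "$source"
      ;;
    *)
      echo "unknown source: $source (expected source_4 or source_13)" >&2
      return 1
      ;;
  esac
}

# Print (do not run) the sync command for one source.
# The destination images/<source>/ is an assumed local layout.
print_image_sync() {
  local source="$1" uri
  uri="$(jump_images_uri "$source")" || return 1
  printf 'aws s3 sync --no-sign-request %s images/%s/\n' "$uri" "$source"
}
```

Calling print_image_sync source_4 and print_image_sync source_13 prints one command per source. Appending --dryrun to a printed command makes the AWS CLI list the transfers it would perform without downloading anything.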
Features were extracted using the CellProfiler pipeline in https://github.com/broadinstitute/imaging-platform-pipelines/tree/master/JUMP_production#production-pipelines.
Instructions for creating the single-cell profiles from images are provided in the Image-based profiling handbook.
Single-cell profiles can be downloaded from the cellpainting gallery public AWS S3 bucket.
source=<SOURCE NAME>
batch=<BATCH NAME>
plate=<PLATE NAME>
aws s3 sync \
--no-sign-request \
s3://cellpainting-gallery/cpg0016-jump/${source}/workspace/backend/${batch}/${plate}/ . \
--exclude "*" --include "*.sqlite"
Well-level profiles are also created using the instructions provided in the Image-based profiling handbook.
Well-level profiles can also be downloaded from the cellpainting gallery public AWS S3 bucket.
source=<SOURCE NAME>
batch=<BATCH NAME>
plate=<PLATE NAME>
aws s3 sync \
--no-sign-request \
s3://cellpainting-gallery/cpg0016-jump/${source}/workspace/profiles/${batch}/${plate}/ .
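When several plates are needed, the command above can be generated once per plate. The following is a minimal sketch under stated assumptions: plates.txt is a hypothetical file that you write yourself, with one "<batch> <plate>" pair per line, and the function name print_profile_syncs and the local profiles/<batch>/<plate>/ destination are also illustrative. Batch and plate names must be taken from the metadata; none are supplied here.

```shell
# Sketch only: print_profile_syncs and the plates.txt format are hypothetical.

# Print (do not run) one sync command per "<batch> <plate>" line in a list file.
print_profile_syncs() {
  local source="$1" list="$2" batch plate
  while read -r batch plate; do
    # Skip blank lines and comment lines starting with '#'.
    case "$batch" in ''|'#'*) continue ;; esac
    # Skip malformed lines that name a batch but no plate.
    [ -n "$plate" ] || continue
    printf 'aws s3 sync --no-sign-request s3://cellpainting-gallery/cpg0016-jump/%s/workspace/profiles/%s/%s/ profiles/%s/%s/\n' \
      "$source" "$batch" "$plate" "$batch" "$plate"
  done < "$list"
}
```

For example, print_profile_syncs source_4 plates.txt prints one line per plate. The printed lines can be reviewed and then run, or saved to a script.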
Various steps were performed to remove technical noise from the profiles. These steps are as follows:
- Well position correction
- Cell count regression
- Normalization
- Outlier removal
- Feature selection
- Sphering
- Harmony correction
These steps can be performed using the jump-profiling-recipe and the appropriate config file (orf.json and crispr.json from the input folder).
The processed profiles are stored in the cellpainting-gallery bucket.
To clone this repository, run the following commands:
git clone https://github.com/jump-cellpainting/2024_Chandrasekaran_Morphmap.git
cd 2024_Chandrasekaran_Morphmap
git submodule update --init --recursive
To download the profiles and other files required to run the analyses in this repository, run the following commands:
cd profiles
./download-profiles.sh
The notebooks in each folder replicate the analyses. To reproduce them, create the conda environment defined in that folder and run its notebooks.
Download and install mamba from miniforge using the appropriate installer for your operating system.
Once mamba is installed, run the following commands in each folder to create and activate its conda environment:
mamba env create -f environment.yml
mamba activate <conda environment name>
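Since the environment has to be created in every folder, it may be convenient to list the folders that define one. The following is a minimal sketch, not part of this repository: the function name print_env_commands is hypothetical, and it assumes each analysis folder sits directly under the repository root and contains a file named environment.yml. It prints the commands instead of running them.

```shell
# Sketch only: print_env_commands is a hypothetical helper.

# Print (do not run) one "mamba env create" command for every folder, at most
# one level below the given root, that contains an environment.yml file.
print_env_commands() {
  local root="${1:-.}" env_file
  find "$root" -maxdepth 2 -name environment.yml | sort | while read -r env_file; do
    printf '(cd "%s" && mamba env create -f environment.yml)\n' "$(dirname "$env_file")"
  done
}
```

Run print_env_commands from the repository root to see the commands. The name to pass to mamba activate is usually the value of the name: field in the corresponding environment.yml.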