- Armingol E., Ghaddar A., Joshi C.J., Baghdassarian H., Shamie I., Chan J., Her H.L., Berhanu S., Dar A., Rodriguez-Armstrong F., Yang O., O’Rourke E.J., Lewis N.E. Inferring a spatial code of cell-cell interactions across a whole animal body. PLOS Computational Biology 18(11): e1010715 (2022). DOI: 10.1371/journal.pcbi.1010715
All the analyses in this repository can be run on a CodeOcean capsule for reproducible results (estimated running time: 3:30 hr): https://doi.org/10.24433/CO.4688840.v2
If you are interested in running everything locally, follow the instructions below.
Follow this tutorial to install Anaconda or Miniconda
Follow this tutorial to install Git
Create a new conda environment:
conda create -n cell2cell -y python=3.7 jupyter
Activate that environment:
conda activate cell2cell
Then, install all dependencies:
pip install numba
pip install umap-learn
pip install 'matplotlib==3.2.0'
pip install 'cell2cell==0.6.0'
pip install git+https://github.com/BubaVV/Pyevolve
pip install tissue_enrichment_analysis
pip install matplotlib_venn
pip install 'xgboost==1.6.2'
If the environment cell2cell is not active, activate it:
conda activate cell2cell
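Since three of the packages above are pinned to exact versions, it can be useful to confirm that the active environment matches them. The snippet below is only a minimal sketch and not part of this repository: the `PINNED` dictionary copies the versions from the pip commands above, and the helper names `installed_version` and `check_pins` are hypothetical.

```python
# Pinned versions copied from the pip commands in this README.
PINNED = {"matplotlib": "3.2.0", "cell2cell": "0.6.0", "xgboost": "1.6.2"}

try:
    # Available in the standard library from Python 3.8 onwards.
    from importlib.metadata import version, PackageNotFoundError

    def installed_version(name):
        """Return the installed version of a package, or None if it is missing."""
        try:
            return version(name)
        except PackageNotFoundError:
            return None

except ImportError:
    # Fallback for Python 3.7 (the version used to create the environment).
    import pkg_resources

    def installed_version(name):
        """Return the installed version of a package, or None if it is missing."""
        try:
            return pkg_resources.get_distribution(name).version
        except pkg_resources.DistributionNotFound:
            return None


def check_pins(pins, lookup=installed_version):
    """Return {package: (expected, found)} for every package that does not match."""
    mismatches = {}
    for name, expected in pins.items():
        found = lookup(name)
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    for name, (expected, found) in check_pins(PINNED).items():
        print(f"{name}: expected {expected}, found {found}")
```

Run it with the cell2cell environment active; if it prints nothing, the three pinned packages match.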
A Jupyter notebook is provided for most of the analyses below. For the analyses without a notebook, detailed instructions are given instead.
Jupyter notebooks can be opened by running the following command from the folder of this repository:
jupyter notebook
- Analyses must be run in the order below (the results of each step are provided, so any analysis can also be run without repeating the previous steps):
-
Generate a list of ligand-receptor interactions from orthologs, which then requires manual curation. This step can be skipped because a manually curated list is provided.
-
Compute intercellular distances and classify cell pairs by ranges of distances.
-
Compute cell-cell interactions and communication for the curated ligand-receptor interactions.
-
Run the genetic algorithm to select important ligand-receptor pairs for obtaining a better correlation between CCI scores and intercellular distance:
- From the main directory of this repository, run:
python ./code/genetic_algorithm.py -s bray_curtis -o GA-Bray-Curtis -r 100 -c 10
*Note: This step can take 1 to 2 days, depending on the number of iterations set in the nested for loops of the .py file. By default, it runs the GA 100 times, distributed across 10 cores (change the value after -c to use a different number of cores).
This step can be skipped because the results of 100 runs of the genetic algorithm using the Bray-Curtis score are provided.
-
Compute cell-cell interactions and communication for the GA-selected ligand-receptor interactions.
-
Perform permutation analyses on GA-selected ligand-receptor pairs.
-
Evaluate enrichment of phenotypes on the genes in the GA-selected list of ligand-receptor pairs.
-
Generate UMAP plots based on Jaccard distance of pairs of cells given active LR pairs.
-
Run an analysis similar to the one in step 4, this time using the LR Count score as the CCI score:
- From the main directory of this repository, run:
python ./Notebooks/genetic_algorithm.py -s count -o GA-LR-Count -r 100 -c 10
*Note: This step can take 1 to 2 days, depending on the number of iterations set in the nested for loops of the .py file. By default, it runs the GA 100 times, distributed across 10 cores (change the value after -c to use a different number of cores).
This step can be skipped because the results of 100 runs of the genetic algorithm using the LR Count score are provided.
-
Run an analysis similar to the one in step 4, this time using the ICELLNET score as the CCI score:
- From the main directory of this repository, run:
python ./Notebooks/genetic_algorithm.py -s icellnet -o GA-ICELLNET -r 100 -c 10
*Note: This step can take 1 to 2 days, depending on the number of iterations set in the nested for loops of the .py file. By default, it runs the GA 100 times, distributed across 10 cores (change the value after -c to use a different number of cores).
This step can be skipped because the results of 100 runs of the genetic algorithm using the ICELLNET score are provided.
-
Compare GA-based selection of LR pairs by using Bray-Curtis score, LR Count score, or ICELLNET score
-
Analyze spatial properties associated with the location type of each LR pair
- Generate CCI scores from Bray-Curtis, LR Count, and Smillie scoring functions
- Generate CCI scores from ICELLNET scoring function
- Generate CCI scores from CellChat scoring function
- Benchmarking of threshold values for binary-based methods
- Benchmarking of all CCI-scores - Classifiers for distinguishing distance range between cells
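Two ideas that recur in the analyses above are classifying cell pairs by ranges of intercellular distance and comparing pairs of cells by the Jaccard distance between their sets of active LR pairs. The sketch below illustrates both with the standard library only. It is not the code used in the notebooks: the coordinates, the range labels and cutoffs in `RANGES`, and the LR pair names are all hypothetical placeholders.

```python
import math
from itertools import combinations

# Hypothetical 3D coordinates (arbitrary units) for three cells.
coords = {
    "cellA": (0.0, 0.0, 0.0),
    "cellB": (3.0, 4.0, 0.0),
    "cellC": (30.0, 40.0, 0.0),
}

# Hypothetical distance ranges as (label, inclusive upper bound).
RANGES = [("short", 10.0), ("mid", 25.0), ("long", math.inf)]


def euclidean(p, q):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def classify(distance, ranges=RANGES):
    """Return the label of the first range whose upper bound covers the distance."""
    for label, upper in ranges:
        if distance <= upper:
            return label
    raise ValueError("ranges must end with an infinite upper bound")


def jaccard_distance(set_a, set_b):
    """Jaccard distance, 1 - |A & B| / |A | B|, defined as 0.0 for two empty sets."""
    union = set_a | set_b
    if not union:
        return 0.0
    return 1.0 - len(set_a & set_b) / len(union)


# Distance range for every unordered pair of cells.
pair_ranges = {
    (a, b): classify(euclidean(coords[a], coords[b]))
    for a, b in combinations(sorted(coords), 2)
}

# Hypothetical sets of active LR pairs for two pairs of cells.
active_lr = {
    ("cellA", "cellB"): {("ligand1", "receptor1"), ("ligand2", "receptor2")},
    ("cellA", "cellC"): {("ligand2", "receptor2"), ("ligand3", "receptor3")},
}

if __name__ == "__main__":
    for pair, label in pair_ranges.items():
        print(pair, label)
    d = jaccard_distance(active_lr[("cellA", "cellB")], active_lr[("cellA", "cellC")])
    print("Jaccard distance between the two cell pairs:", round(d, 3))
```

A matrix of such Jaccard distances between all pairs of cells is the kind of input a UMAP projection with a precomputed metric would take, although the notebooks may build it differently.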
Disclaimer: Figures from the Jupyter notebooks may differ from those in the paper, depending on the installed versions of the dependencies of the respective analyses. The same may happen with certain results that depend on external tools. To ensure the figures look the same, use the CodeOcean capsule instead.