Showing 4 changed files with 353 additions and 78 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -2593,6 +2593,124 @@ | |
##' | ||
"guise2024" | ||
|
||
####---- petrosius2023_mES ----#### | ||
|
||
##' Petrosius et al, 2023 (Nat. Comm.): Mouse embryonic stem cell (mESC) in | ||
##' different culture conditions | ||
##' | ||
##' @description | ||
##' Profiling mouse embryonic stem cells across ground-state (m2i) and | ||
##' differentiation-permissive (m15) culture conditions. The data were | ||
##' acquired using orbitrap-based data-independent acquisition (DIA). | ||
##' The objective was to demonstrate the capability of their approach | ||
##' by profiling mouse embryonic stem cell culture conditions, showcasing | ||
##' heterogeneity in global proteomes, and highlighting differences in | ||
##' the expression of key metabolic enzymes in distinct cell subclusters. | ||
##' | ||
##' @format A [QFeatures] object with 605 assays, each assay being a | ||
##' [SingleCellExperiment] object: | ||
##' | ||
##' - Assay 1-603: PSM data acquired with an orbitrap-based data-independent | |
##' acquisition (DIA) protocol, hence each of those assays contains a single | |
##' column holding the quantitative information. | |
##' - `peptides`: peptide data containing quantitative data for 9884 | ||
##' peptides and 603 single-cells. | ||
##' - `proteins`: protein data containing quantitative data for 4270 | ||
##' proteins and 603 single-cells. | ||
##' | ||
##' Sample annotation is stored in `colData(petrosius2023_mES())`. | ||
##' | ||
##' @section Acquisition protocol: | ||
##' | ||
##' The data were acquired using the following setup. More information | ||
##' can be found in the source article (see `References`). | ||
##' | ||
##' - **Sample isolation**: Cell sorting was done on a Sony MA900 cell sorter | ||
##' using a 130 μm sorting chip. Cells were sorted at single-cell resolution, | ||
##' into a 384-well Eppendorf LoBind PCR plate (Eppendorf AG) containing 1 μL | ||
##' of lysis buffer. | ||
##' - **Sample preparation**: Single-cell protein lysates were digested with | ||
##' 2 ng of Trypsin (Sigma cat. Nr. T6567) supplied in 1 μL of digestion | ||
##' buffer (100 mM TEAB pH 8.5, 1:5000 (v/v) benzonase (Sigma cat. Nr. E1014)). | |
##' The digestion was carried out overnight at 37 °C, and subsequently | ||
##' acidified by the addition of 1 μL 1% (v/v) trifluoroacetic acid (TFA). | ||
##' All liquid dispensing was done using an I-DOT One instrument (Dispendix). | ||
##' - **Liquid chromatography**: The Evosep One liquid chromatography system was | |
##' used for DIA isolation window survey and HRMS1-DIA experiments. The standard | |
##' 31 min or 58 min pre-defined Whisper gradients were used, where peptide | |
##' elution is carried out with a 100 nL/min flow rate. A 15 cm × 75 μm | |
##' ID column (PepSep) with 1.9 μm C18 beads (Dr. Maisch, Germany) and a 10 | |
##' μm ID silica electrospray emitter (PepSep) were used. Both LC systems were | |
##' coupled online to an Orbitrap Eclipse Tribrid Mass Spectrometer | |
##' (ThermoFisher Scientific) via an EasySpray ion source connected to a | |
##' FAIMSPro device. | |
##' - **Mass spectrometry**: The mass spectrometer was operated in positive | ||
##' mode with the FAIMSPro interface compensation voltage set to −45 V. | ||
##' MS1 scans were carried out at 120,000 resolution with an automatic gain | ||
##' control (AGC) of 300% and maximum injection time set to auto. For the DIA | ||
##' isolation window survey, a scan range of 500–900 was used, and 400–1000 for the | |
##' rest of the experiments. Higher energy collisional dissociation (HCD) was | |
##' used for precursor fragmentation with a normalized collision energy (NCE) | ||
##' of 33% and MS2 scan AGC target was set to 1000%. | ||
##' - **Raw data processing**: The mESC raw data files were processed with | |
##' Spectronaut 17, and protein abundance tables were exported and analyzed | |
##' further with Python. | |
##' | ||
##' @section Data collection: | ||
##' | ||
##' The data were provided by the author and are accessible at the | |
##' [Dataverse](https://dataverse.uclouvain.be/dataset.xhtml?persistentId=doi:10.14428/DVN/EMAVLT). | |
##' The folder ('20240205_111248_mESC_SNEcombine_m15-m2i/') contains the | ||
##' following files of interest: | ||
##' | ||
##' - `20240205_111251_PEPQuant (Normal).tsv`: the PSM level data | ||
##' - `20240205_111251_Peptide Quant (Normal).tsv`: the peptide level data | ||
##' - `20240205_111251_PGQuant (Normal).tsv`: the protein level data | ||
##' | ||
##' The metadata were downloaded from the | |
##' [Zenodo repository](https://zenodo.org/records/8146605). | |
##' | ||
##' - `sample_facs.csv`: the metadata | ||
##' | ||
##' We formatted the quantification table so that its columns match the | |
##' metadata. Both tables were then combined in a single | |
##' [QFeatures] object using the [scp::readSCP()] function. | |
##' | ||
##' The peptide data were formatted to a [SingleCellExperiment] object and the | |
##' sample metadata were matched to the column names and stored in the `colData`. | ||
##' The object is then added to the [QFeatures] object and the rows of the PSM | ||
##' data are linked to the rows of the peptide data based on the peptide sequence | ||
##' information through an `AssayLink` object. | ||
##' | ||
##' The protein data were formatted to a [SingleCellExperiment] object and | |
##' the sample metadata were matched to the column names and stored in the | |
##' `colData`. The object is then added to the [QFeatures] object and the rows | |
##' of the peptide data are linked to the rows of the protein data based on the | |
##' protein accession information through an `AssayLink` object. | |
##' | ||
##' @source | ||
##' The peptide and protein data can be downloaded from the | |
##' [Dataverse](https://dataverse.uclouvain.be/dataset.xhtml?persistentId=doi:10.14428/DVN/EMAVLT). | |
##' The raw data and the quantification data can also be found in the | ||
##' MassIVE repository `MSV000092429`: | ||
##' ftp://[email protected]/. | ||
##' | ||
##' @references | ||
##' **Source article**: Petrosius, V., Aragon-Fernandez, P., Üresin, N. et al. | ||
##' "Exploration of cell state heterogeneity using single-cell proteomics | ||
##' through sensitivity-tailored data-independent acquisition." | ||
##' Nat Commun 14, 5910 (2023). | ||
##' ([link to article](https://doi.org/10.1038/s41467-023-41602-1)). | ||
##' | ||
##' @examples | ||
##' \donttest{ | ||
##' petrosius2023_mES() | ||
##' } | ||
##' | ||
##' @keywords datasets | ||
##' | ||
"petrosius2023_mES" | ||
|
||
####---- petrosius2023_AML ----#### | ||
|
||
##' Petrosius et al. 2023 (bioRxiv): AML hierarchy on Astral. | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -23,4 +23,5 @@ | |
"gregoire2023_mixCTRL","Single-cell proteomics data from two monocyte cell lines","3.19",NA,"TXT","https://www.ebi.ac.uk/pride/archive/projects/PXD046211",NA,"Homo sapiens",9606,TRUE,"PRIDE","Samuel Gregoire <[email protected]>","QFeatures","Rda","scpdata/gregoire2023_mixCTRL.Rda",2024-01-22,119,"Sage","TMT-16",TRUE,TRUE,TRUE,TRUE,NA | ||
"khan2023","Single-cell proteomics data of 421 MCF-10A cells undergoing EMT triggered by TGFβ","3.19",NA,"TXT","https://drive.google.com/drive/folders/1zCsRKWNQuAz5msxx0DfjDrIe6pUjqQmj",NA,"Homo sapiens",9606,TRUE,"MassIVE","Enes Sefa Ayar <[email protected]>","QFeatures","Rda","scpdata/khan2023.Rda",2023-12-21,47,"MaxQuant","TMTPro 16plex",TRUE,TRUE,TRUE,TRUE,NA | ||
"guise2024","Single-cell proteomics data of 108 postmortem CTL or ALS spinal moto neurons","3.19",NA,"TXT","ftp://massive.ucsd.edu/v05/MSV000092119/",NA,"Homo sapiens",9606,TRUE,"MassIVE","Christophe Vanderaa <[email protected]>","QFeatures","Rda","scpdata/guise2024.rda",2024-01-05,47,"Proteome Discoverer","LFQ",TRUE,TRUE,TRUE,TRUE,NA | ||
"petrosius2023_AML","Single-cell proteomics data of 4 cell types from the OCI-AML8227 model.","3.19",NA,"TXT","https://dataverse.uclouvain.be/dataset.xhtml?persistentId=doi:10.14428/DVN/EMAVLT",NA,"Homo sapiens",9606,TRUE,"Dataverse","Samuel Gregoire <[email protected]>","QFeatures","Rda","scpdata/petrosius2023.Rda",2023-06-08,217,"Spectronaut","LFQ",TRUE,TRUE,TRUE,TRUE,NA | ||
"petrosius2023_mES","Mouse embryonic stem cells across ground-state (m2i) and differentiation-permissive (m15) culture conditions.","3.19",NA,"TXT","https://dataverse.uclouvain.be/dataset.xhtml?persistentId=doi:10.14428/DVN/EMAVLT",NA,"Homo sapiens",9606,TRUE,"Dataverse","Enes Sefa Ayar <[email protected]>","QFeatures","Rda","scpdata/petrosius2023_mES.Rda",2024-04-09,605,"Spectronaut","LFQ",TRUE,TRUE,TRUE,TRUE,NA | ||
"petrosius2023_AML","Single-cell proteomics data of 4 cell types from the OCI-AML8227 model.","3.19",NA,"TXT","https://dataverse.uclouvain.be/dataset.xhtml?persistentId=doi:10.14428/DVN/4DSPJM",NA,"Homo sapiens",9606,TRUE,"Dataverse","Samuel Gregoire <[email protected]>","QFeatures","Rda","scpdata/petrosius2023.Rda",2023-06-08,217,"Spectronaut","LFQ",TRUE,TRUE,TRUE,TRUE,NA |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,129 @@ | ||
|
||
####---- Petrosius et al, 2023 ---#### | ||
|
||
|
||
## Petrosius, V., Aragon-Fernandez, P., Üresin, N. et al. Exploration of cell | ||
## state heterogeneity using single-cell proteomics through sensitivity-tailored | ||
## data-independent acquisition. Nat Commun 14, 5910 (2023). | ||
## https://doi.org/10.1038/s41467-023-41602-1 | ||
|
||
library(SingleCellExperiment) | ||
library(scp) | ||
library(tidyverse) | ||
|
||
####---- Load PSM data ----#### | ||
## The PSM data were downloaded from https://dataverse.uclouvain.be/dataset.xhtml?persistentId=doi:10.14428/DVN/EMAVLT | |
## and 'sample_facs.csv' from https://zenodo.org/records/8146605 | |
## '20240205_111251_PEPQuant (Normal).tsv' = contains the PSM data. | ||
## 'sample_facs.csv' = contains the cell annotations. | ||
|
||
root <- "~/localdata/SCP/petrosiusmESC/20240205_111248_mESC_SNEcombine_m15-m2i/" | ||
ev <- read.delim(paste0(root, "20240205_111251_PEPQuant (Normal).tsv")) | ||
design <- read.delim(paste0(root, "sample_facs.csv")) | ||
|
||
####---- Create sample annotation ----#### | ||
design %>% | ||
select(-X) %>% | ||
distinct() %>% | ||
add_column(Channel = "PEP.Quantity") %>% | ||
rename(Set = File.Name, | ||
SampleType = Plate) -> | ||
meta | ||
|
||
## Clean quantitative data | ||
ev %>% | ||
rename(Set = R.FileName, | ||
protein = PG.ProteinAccessions) %>% | ||
## Create a modified sequence + charge variable | ||
mutate(peptide = paste0("_", PEP.StrippedSequence, "_.", FG.Charge)) %>% | ||
filter(Set %in% meta$Set) -> | ||
evproc | ||
|
||
## Create the QFeatures object | ||
petrosius2023_mES <- readSCP(evproc, | ||
meta, | ||
channelCol = "Channel", | ||
batchCol = "Set", | ||
removeEmptyCols = TRUE) | ||
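As a rough illustration of why each PSM assay ends up with a single quantitative column: the long table holds one `PEP.Quantity` column, and it is divided into one assay per `Set`. The base R sketch below mimics that split on made-up data. It is not how `readSCP()` is implemented; the toy values, the run names, and the column naming (run name directly followed by the channel name) are assumptions.

```r
## Toy long-format table: one row per precursor per run (values are invented)
evproc_toy <- data.frame(
    Set = c("run1", "run1", "run2"),
    peptide = c("_AAK_.2", "_CCR_.2", "_AAK_.2"),
    PEP.Quantity = c(100, 250, 90),
    stringsAsFactors = FALSE
)

## Split by acquisition run, keeping the single quantification column
by_run <- split(evproc_toy, evproc_toy$Set)
quant <- lapply(by_run, function(x) {
    matrix(x$PEP.Quantity, ncol = 1,
           dimnames = list(x$peptide, paste0(x$Set[1], "PEP.Quantity")))
})

## Number of columns in each per-run matrix
n_cols <- vapply(quant, ncol, integer(1))
```

With one run per single cell, this layout gives as many assays as cells, each with one column.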
|
||
|
||
####---- Peptide data ----#### | ||
## The peptide data were downloaded from https://dataverse.uclouvain.be/dataset.xhtml?persistentId=doi:10.14428/DVN/EMAVLT | |
## '20240205_111251_Peptide Quant (Normal).tsv' contains the peptide data. | ||
|
||
## Load the peptide level quantification data | ||
pep_data <- read.delim(paste0(root, "20240205_111251_Peptide Quant (Normal).tsv")) | ||
|
||
## Clean quantitative data | ||
pep_data %>% | ||
pivot_wider(names_from = R.FileName, | ||
values_from = PG.Quantity, | ||
id_cols = c(EG.PrecursorId, PG.ProteinAccessions)) -> | ||
peps | ||
|
||
## Create the SingleCellExperiment object | ||
pep <- readSingleCellExperiment(peps, | ||
ecol = 3:605) | ||
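The hard-coded `ecol = 3:605` assumes exactly two identifier columns followed by 603 runs. A more defensive variant derives the quantitative columns from the identifier names. The sketch below uses base R `reshape()` in place of `pivot_wider()` so that it runs without extra packages (so the wide column names differ from the tidyr output); the toy values are invented.

```r
## Toy long-format peptide table (invented values)
pep_long <- data.frame(
    EG.PrecursorId = c("_AAK_.2", "_AAK_.2", "_CCR_.2"),
    PG.ProteinAccessions = c("P1", "P1", "P2"),
    R.FileName = c("run1", "run2", "run1"),
    PG.Quantity = c(10, 20, 30),
    stringsAsFactors = FALSE
)

id_cols <- c("EG.PrecursorId", "PG.ProteinAccessions")

## Long to wide: one row per precursor, one column per run
pep_wide <- reshape(pep_long, idvar = id_cols, timevar = "R.FileName",
                    direction = "wide")

## Derive the quantitative column indices instead of hard-coding them
ecol <- which(!names(pep_wide) %in% id_cols)
```

The derived `ecol` stays correct if a run is added or dropped upstream, whereas a fixed range would silently include or skip a column.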
|
||
## Name rows with the precursor identifier | |
rownames(pep) <- peps$EG.PrecursorId | ||
|
||
## Rename columns so they match the PSM data | |
colnames(pep) %>% | ||
paste0("PEP.Quantity") -> | ||
colnames(pep) | ||
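The renaming only helps if the resulting names are exactly those used by the PSM assays. A quick check along these lines can be run before `addAssay()`. The run names are invented, and the PSM-side naming (run name immediately followed by the channel name) is an assumption inferred from the renaming step.

```r
## Invented run names standing in for the acquisition files
runs <- c("run1", "run2", "run3")

## Assumed PSM-level sample names: run name directly followed by the channel
psm_names <- paste0(runs, "PEP.Quantity")

## Peptide-level sample names after renaming, deliberately in another order
pep_names <- paste0(rev(runs), "PEP.Quantity")

## Same set of samples, regardless of column order
missing_in_pep <- setdiff(psm_names, pep_names)
missing_in_psm <- setdiff(pep_names, psm_names)
names_match <- length(missing_in_pep) == 0 && length(missing_in_psm) == 0
```

If `names_match` is `FALSE`, the two `missing_in_*` vectors show which samples would fail to line up with the sample annotation.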
|
||
## Include the peptide data in the QFeatures object | ||
petrosius2023_mES <- addAssay(petrosius2023_mES, pep, name = "peptides") | ||
|
||
## Link the PSMs and the peptides | ||
petrosius2023_mES <- addAssayLink(petrosius2023_mES, | ||
from = 1:603, | ||
to = "peptides", | ||
varFrom = rep("EG.PrecursorId", 603), | ||
varTo = "EG.PrecursorId") | ||
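Conceptually, an `AssayLink` records which rows of the child assay derive from which rows of the parent, using a shared row variable. The base R sketch below reproduces that lookup with `match()` on invented identifiers. It illustrates the idea only and is not the QFeatures implementation.

```r
## Row variable of one toy PSM assay and row names of a toy peptide assay
psm_ids <- c("_AAK_.2", "_CCR_.2", "_XYZ_.3")
pep_ids <- c("_CCR_.2", "_AAK_.2")

## Index of the peptide row each PSM row maps to (NA when absent)
link <- match(psm_ids, pep_ids)

## PSM rows that cannot be linked deserve a look before relying on the link
unlinked <- psm_ids[is.na(link)]
```

A non-empty `unlinked` vector would suggest that the identifier columns of the two tables are not built the same way.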
|
||
|
||
####---- Add the protein data ----#### | ||
## The protein data were downloaded from https://dataverse.uclouvain.be/dataset.xhtml?persistentId=doi:10.14428/DVN/EMAVLT | |
## '20240205_111251_PGQuant (Normal).tsv' contains the protein data. | ||
|
||
prot_data <- read.delim(paste0(root, "20240205_111251_PGQuant (Normal).tsv")) | ||
|
||
## Clean quantitative data | ||
prot_data %>% | ||
mutate(R.FileName = sub(".*rawfiles/", "", R.Raw.File.Name)) %>% | ||
mutate(R.FileName = sub("\\.raw$", "", R.FileName)) %>% | |
pivot_wider(names_from = R.FileName, | ||
values_from = PG.Quantity, | ||
id_cols = PG.ProteinAccessions) -> | ||
prots | ||
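When stripping the `.raw` extension with `sub()`, it matters whether the dot is escaped and the pattern anchored, because an unescaped `.` matches any character. The sketch below compares both patterns on invented file paths; the directory layout is an assumption.

```r
## Invented raw file paths
raw_paths <- c("D:/acq/rawfiles/straw_cell_001.raw",
               "D:/acq/rawfiles/cell_002.raw")

## Drop everything up to and including the "rawfiles/" directory
file_names <- sub(".*rawfiles/", "", raw_paths)

## Unescaped pattern: "." matches any character, so the first "<any>raw"
## substring is removed, wherever it occurs
loose <- sub(".raw", "", file_names)

## Escaped and anchored pattern: only a trailing ".raw" extension is removed
strict <- sub("\\.raw$", "", file_names)
```

The two patterns agree whenever "raw" appears only in the extension, which is why the loose form can go unnoticed until a file name happens to contain it elsewhere.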
|
||
## Create the SingleCellExperiment object | ||
pro <- readSingleCellExperiment(prots, | ||
ecol = 2:604) | ||
|
||
## Name rows with protein accessions | |
rownames(pro) <- prots$PG.ProteinAccessions | ||
|
||
## Rename columns so they match the PSM data | |
colnames(pro) %>% | ||
paste0("PEP.Quantity") -> | ||
colnames(pro) | ||
|
||
## Include the protein data in the QFeatures object | |
petrosius2023_mES <- addAssay(petrosius2023_mES, pro, name = "proteins") | ||
|
||
## Link the peptides and the proteins | |
petrosius2023_mES <- addAssayLink(petrosius2023_mES, | ||
from = "peptides", | ||
to = "proteins", | ||
varFrom = "PG.ProteinAccessions", | ||
varTo = "PG.ProteinAccessions") | ||
|
||
## Save data | ||
save(petrosius2023_mES, | ||
file = paste0(root, "petrosius2023_mES.Rda"), | |
compress = "xz", | ||
compression_level = 9) | ||
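After saving with `xz` compression, a quick round trip confirms that the file loads and holds the expected object. The sketch below uses a small stand-in list and a temporary directory, both assumptions made so the example runs anywhere; the real script saves the QFeatures object under `root`.

```r
## Stand-in object; the real script saves the QFeatures object
toy <- list(assays = 605L, name = "petrosius2023_mES")

## file.path() inserts the separator, so the directory needs no trailing slash
out_file <- file.path(tempdir(), "toy_petrosius2023_mES.Rda")
save(toy, file = out_file, compress = "xz", compression_level = 9)

## Reload into a fresh environment and compare with the original
env <- new.env()
loaded <- load(out_file, envir = env)
round_trip_ok <- identical(env[[loaded]], toy)
```

Loading into a separate environment avoids overwriting the object already in the workspace, so the comparison is meaningful.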
|