Methods for bulk, spatial and merged objects #58

Merged
merged 8 commits into from
Mar 4, 2024

allyhawkins
Member

Closes #41
Closes #43

This PR adds the methods for processing bulk and spatial data and for creating the merged object. I was unsure about the level of detail for the merged workflow, since there's not much to the actual merging other than combining things. We could mention that we use `cbind`, etc., but I didn't think that was totally necessary. I'm curious what others think.

Also, please check to make sure I'm not missing any important references that we should include.

Tagging @jashapiro for review of bulk and spatial and @sjspielman for review of merged objects.


Click the link below to download the manuscript build as a ZIP file.
This build is associated with commit 498d232.

Manuscript build

content/04.methods.md (3 outdated, resolved review threads)
These genes are used to calculate library-aware principal components with `batchelor::multiBatchPCA()`.
The top 50 principal components were selected and used to calculate UMAP embeddings for the merged object.

If any libraries included in the ScPCA project contain additional ADT data, the ADT data is also merged and stored in the `altExp` slot of the merged `SingleCellExperiment` object.
Member


Do we also want to say that if any libraries in the project don't have ADT, they will still be there but with NA values? Is that overkill?

Member Author


I feel like that might be overkill. I think the point here is that if there was ADT data to begin with it will also be present in the merged object.
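For anyone reading along, the behavior discussed in this thread can be sketched roughly as follows. This is only an illustration in plain Python, not the `merge.nf` implementation: the `merge_adt` function and the dictionary layout are hypothetical, and `None` stands in for `NA`. It assumes, as described in the comment above, that cells from libraries without ADT data are kept and filled with missing values.

```python
def merge_adt(libraries):
    """Column-bind per-library ADT counts; cells from libraries without ADT get None.

    Hypothetical layout: each library is a dict with a "cells" list and an
    "adt" entry that is either None or a dict of feature name -> per-cell counts.
    """
    # Collect ADT feature names in first-seen order across all libraries.
    features = []
    for lib in libraries:
        for name in (lib["adt"] or {}):
            if name not in features:
                features.append(name)

    merged = {"cells": [], "adt": {name: [] for name in features}}
    for lib in libraries:
        n_cells = len(lib["cells"])
        merged["cells"].extend(lib["cells"])
        for name in features:
            values = (lib["adt"] or {}).get(name)
            # None is the placeholder for NA when this library lacks the feature.
            merged["adt"][name].extend(
                values if values is not None else [None] * n_cells
            )
    return merged


libraries = [
    {"cells": ["A-1", "A-2"], "adt": {"CD3": [5, 0]}},
    {"cells": ["B-1"], "adt": None},  # library without ADT data
]
merged = merge_adt(libraries)
```

In this toy case every cell ends up in the merged object, and the one cell from the library without ADT carries a missing value for each ADT feature.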

content/04.methods.md (2 outdated, resolved review threads)
Co-authored-by: Stephanie Spielman

github-actions bot commented Mar 1, 2024

Click the link below to download the manuscript build as a ZIP file.
This build is associated with commit e68081e.

Manuscript build

Member

@sjspielman left a comment


LGTM!

While you're here though, can you fix (or try to fix...) these two citations that I found in the HTML build?

  1. The STAR reference is missing a colon in @doi:

    Bulk RNA-seq reads for each sample were mapped to a reference genome using `STAR` [@doi10.1093/bioinformatics/bts635] and multiplexed single-cell or single-nuclei RNA-seq reads were mapped to the same reference genome using `STARsolo`[@doi:10.1101/2021.05.05.442755].

  2. Honestly not sure why this miQC reference doesn't render properly, but if you can figure it out feel free to fix! But it'll get fixed eventually either way.

    Then, low-quality cells are identified and removed with `miQC` [@doi:10.1371/journal.pcbi.1009290], which jointly models the proportion of mitochondrial reads and detected genes per cell and calculates a probability that each cell is compromised.
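A quick way to catch the first kind of problem (a citation key missing its colon) before the HTML build might look like the sketch below. It is an assumption-laden illustration, not part of the manuscript tooling: the `find_malformed_citations` helper is hypothetical, and the prefix list is a guess at the common Manubot prefixes rather than an exhaustive one. It would also flag any tag-style key that merely begins with one of these prefixes, so treat hits as candidates to review.

```python
import re

# Matches bracketed citation groups such as [@doi:10.1093/...; @pmid:123].
CITATION_GROUP = re.compile(r"\[(@[^\]]+)\]")

# Prefixes assumed for this sketch; not an exhaustive list.
KNOWN_PREFIXES = ("doi", "pmid", "pmcid", "arxiv", "isbn", "url")


def find_malformed_citations(text):
    """Return citation keys that start with a known prefix but lack the colon."""
    malformed = []
    for group in CITATION_GROUP.findall(text):
        for key in group.split(";"):
            key = key.strip().lstrip("@")
            for prefix in KNOWN_PREFIXES:
                if key.startswith(prefix) and not key.startswith(prefix + ":"):
                    malformed.append("@" + key)
                    break
    return malformed


line = (
    "mapped using `STAR` [@doi10.1093/bioinformatics/bts635] and "
    "`STARsolo`[@doi:10.1101/2021.05.05.442755]."
)
print(find_malformed_citations(line))
```

Run against the STAR sentence quoted above, this reports only the key without the colon. It does not explain the `miQC` rendering issue, whose key already looks well formed.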


github-actions bot commented Mar 4, 2024

Click the link below to download the manuscript build as a ZIP file.
This build is associated with commit 1e76ac8.

Manuscript build

Member

@jashapiro left a comment


The overall content here looks good. I had a few minor order changes, and thoughts about when we want to delve into details of implementation vs. just presenting the process. But nothing I need to see again for now, as I am sure these will come up in round-robin.

content/04.methods.md (3 outdated, resolved review threads)
- combining counts data and metadata

Merged objects are created with the `merge.nf` workflow within `scpca-nf`.
This workflow takes as input the processed `SingleCellExperiment` objects in a given ScPCA project output by `scpca-nf` and creates a single merged `SingleCellExperiment` object containing gene expression data and metadata from all libraries in that project.
Member


I feel like the fact that the SCEs come from scpca-nf is implied, and removing that clause makes the sentence flow a bit more smoothly.

Suggested change (current text, then proposed text):
This workflow takes as input the processed `SingleCellExperiment` objects in a given ScPCA project output by `scpca-nf` and creates a single merged `SingleCellExperiment` object containing gene expression data and metadata from all libraries in that project.
This workflow takes as input the processed `SingleCellExperiment` objects in a given ScPCA project and creates a single merged `SingleCellExperiment` object containing gene expression data and metadata from all libraries in that project.

Member Author


I don't know that it's necessarily implied so I think for methods purposes it's better to be specific, even if it makes the sentence a little clunkier.

content/04.methods.md (2 outdated, resolved review threads)

github-actions bot commented Mar 4, 2024

Click the link below to download the manuscript build as a ZIP file.
This build is associated with commit f021fec.

Manuscript build


github-actions bot commented Mar 4, 2024

Click the link below to download the manuscript build as a ZIP file.
This build is associated with commit 2779d59.

Manuscript build


github-actions bot commented Mar 4, 2024

Click the link below to download the manuscript build as a ZIP file.
This build is associated with commit e6c2794.

Manuscript build

@allyhawkins merged commit 4d14dc1 into main on Mar 4, 2024
1 check passed
@allyhawkins deleted the allyhawkins/spatial-bulk-merged-methods branch on March 4, 2024 19:50
Development

Successfully merging this pull request may close these issues.

Generating merged data
Bulk and spatial methods
3 participants