Collate and SingleCells to accept less/different compartments #301
Description
I'm proposing these changes to Collate and SingleCells so that they accept a different number of cell compartments, or even just one compartment.
Related to issue #272
This is still a work in progress.
So far, (1) I've added the option in collate.py to accept 3 flags, for no-cells, no-cytoplasm, or no-nuclei; (2) the assert_linking_cols_complete check only happens when there's more than one compartment.
Proposed changes/discussion
From some discussion with @bethac07, we saw two options:
The first option is just to add more documentation to SingleCells (which I did), and expect the user to build their own dictionary and provide it as compartment_linking_cols. If that's how you'd like to do it, I think the changes I've already made would be enough to merge.
OR build the dictionary for compartment_linking_cols based on the compartments given:
Now: compartment_linking_cols is set to default_linking_cols if the user doesn't specify a dictionary. Also, for SingleCells to work right now with only one compartment, you must give a dictionary linking the compartment to itself, for example {"cells": {"cells": "ObjectNumber"}}, which works, but I don't know if that's the right way to do it.
So, to build the dictionary: {"compA": {"parent": {"compB": "object_Parent_compB"}, "child": {"compA": "ObjectNumber"}}}, where it will take the user's compartments and create this dictionary. I just don't know the best way to handle this: how do we know the parent-child relationships for objects other than nuclei, cells, and cytoplasm if the user doesn't give that info?
par_child_dict = {"nuclei": {"cells", "cytoplasm"}, "cells": {"cytoplasm"}}, from which the relationships can be inferred, and compartment_linking_cols is built based on par_child_dict.
sc_df is being merged to work with only one compartment without the need to provide a dictionary that links the compartment to itself.
What is the nature of your change?
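To make the second option from the discussion above more concrete, here is a rough sketch of how a linking dictionary could be derived from a parent-to-children mapping like par_child_dict. This is not the code in this PR: build_linking_cols is a hypothetical helper, and the "<Child>_Parent_<Parent>" column naming is an assumption based on CellProfiler-style output. The single-compartment branch reproduces the self-linking workaround mentioned above.

```python
def build_linking_cols(compartments, par_child_dict):
    """Hypothetical helper (not part of pycytominer).

    Builds a linking-column dictionary for the given compartments from a
    mapping of parent compartment -> set of child compartments.
    """
    compartments = [c.lower() for c in compartments]

    # One compartment: link it to itself, as in the workaround above.
    if len(compartments) == 1:
        only = compartments[0]
        return {only: {only: "ObjectNumber"}}

    linking = {c: {} for c in compartments}
    for parent, children in par_child_dict.items():
        for child in sorted(children):
            # Skip relationships that involve a compartment the user left out.
            if parent not in linking or child not in linking:
                continue
            # ASSUMPTION: the child table stores the parent's object number in
            # a column named "<Child>_Parent_<Parent>".
            linking[child][parent] = (
                f"{child.capitalize()}_Parent_{parent.capitalize()}"
            )
            linking[parent][child] = "ObjectNumber"
    return linking


par_child_dict = {"nuclei": {"cells", "cytoplasm"}, "cells": {"cytoplasm"}}
two = build_linking_cols(["cells", "nuclei"], par_child_dict)
one = build_linking_cols(["cells"], par_child_dict)
```

This does not settle the open question above: for compartments other than nuclei, cells, and cytoplasm, the parent-child relationship would still have to come from the user, for example by passing their own par_child_dict.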
Checklist