This tool lets you easily compare groups of data specified in a tidy pandas dataframe.
This repo demonstrates the capabilities of the PairwiseCompareManager through examples that use a dataset from nf1_schwann_cell_painting_data.
These examples can be found in the docs.
Most of the development code, however, lives in the src folder.
Users should almost exclusively interact with the PairwiseCompareManager, although there may be rare exceptions.
If you choose to interact with another component of the tool, fewer input validation safeguards will be available.
In Python 3.10 and 3.11 you can install this tool with:
pip install git+https://github.com/WayScience/pairwise_compare.git
Note: using a package manager such as Conda to install this tool is highly recommended.
Once installed, you can import components such as the PairwiseCompareManager with:
from comparison_tools.PairwiseCompareManager import PairwiseCompareManager
When passing arguments to the PairwiseCompareManager, you can specify the columns that remain the same in each group-to-group comparison and the columns that will be different in these comparisons.
These columns are parameterized by _same_columns and _different_columns, respectively.
The values in these columns uniquely define each group.
In every pairwise comparison, the two groups being compared share the same values in all columns listed in _same_columns.
Likewise, the two groups have different values in all columns listed in _different_columns.
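The pairing rule above can be illustrated with plain pandas. This is a minimal sketch of the semantics only, not the tool's implementation: the dataframe, its column names, and the enumerate_group_pairs helper are all hypothetical.

```python
from itertools import combinations

import pandas as pd

# Hypothetical tidy dataframe; the column names are illustrative only.
df = pd.DataFrame(
    {
        "plate": ["p1", "p1", "p1", "p1", "p2", "p2"],
        "genotype": ["WT", "WT", "Null", "Null", "WT", "Null"],
        "feature_a": [0.1, 0.2, 0.9, 1.1, 0.3, 0.8],
    }
)

same_columns = ["plate"]
different_columns = ["genotype"]


def enumerate_group_pairs(data, same_cols, diff_cols):
    """List pairs of group keys that agree on every column in same_cols
    and differ on every column in diff_cols (hypothetical helper)."""
    group_cols = same_cols + diff_cols
    # Each unique combination of grouping values defines one group.
    keys = data[group_cols].drop_duplicates().to_dict("records")
    pairs = []
    for first, second in combinations(keys, 2):
        same_ok = all(first[col] == second[col] for col in same_cols)
        diff_ok = all(first[col] != second[col] for col in diff_cols)
        if same_ok and diff_ok:
            pairs.append((first, second))
    return pairs


group_pairs = enumerate_group_pairs(df, same_columns, different_columns)
for first, second in group_pairs:
    print(first, "vs", second)
```

In this sketch, groups are only paired within a plate and only across genotypes, so groups from different plates are never compared with each other.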
One of the following column-argument conditions must be satisfied when using the PairwiseCompareManager:
- _same_columns must include at least one list element if _different_columns has fewer than two list elements.
- _different_columns must contain one or more list elements.
- _same_columns and _different_columns should not contain any of the same columns.
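To check your own arguments against these conditions before calling the tool, something like the following may help. It is a sketch under assumptions: check_column_conditions is a hypothetical helper, not part of the package, and it only reports each condition separately rather than reproducing how the PairwiseCompareManager combines or enforces them.

```python
def check_column_conditions(same_columns, different_columns):
    """Report, per condition, whether the column arguments satisfy it.

    Hypothetical helper for illustration; the tool's own validation may differ.
    """
    same = list(same_columns)
    different = list(different_columns)
    return {
        # _same_columns needs an element when _different_columns has fewer than two.
        "same_nonempty_when_few_different": len(different) >= 2 or len(same) >= 1,
        # _different_columns needs at least one element.
        "different_nonempty": len(different) >= 1,
        # The two lists must not share any column.
        "no_shared_columns": set(same).isdisjoint(different),
    }


report = check_column_conditions(["plate"], ["genotype"])
print(report)
```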
- Input validation is enforced in the PairwiseCompareManager and PairwiseCompare classes.
- Additional column values are not tracked aside from columns used to compare groups (_same_columns, _different_columns).
- Output and input Python data structures are limited.
- All of the data in the supplied pandas dataframe is used to compute comparisons.
This tool compares features between any two groups of a tidy pandas dataframe.
To add a new comparator, implement it as a class that inherits from the Comparator class (see Comparator.py for details).
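The general shape of such a subclass might look like the sketch below. The base class here is a stand-in written for this example: HypotheticalComparator and its compare method are assumptions, and the actual abstract methods you must implement are the ones defined in Comparator.py.

```python
from abc import ABC, abstractmethod

import pandas as pd


class HypotheticalComparator(ABC):
    """Stand-in base class for illustration; NOT the interface in Comparator.py."""

    @abstractmethod
    def compare(self, group0: pd.DataFrame, group1: pd.DataFrame) -> dict:
        """Return one comparison value per feature column."""


class MeanDifferenceComparator(HypotheticalComparator):
    """Example comparator: difference of per-feature means between two groups."""

    def compare(self, group0: pd.DataFrame, group1: pd.DataFrame) -> dict:
        difference = group0.mean() - group1.mean()
        return difference.to_dict()


# Two small hypothetical groups that share the same feature columns.
group0 = pd.DataFrame({"feature_a": [1.0, 3.0], "feature_b": [2.0, 2.0]})
group1 = pd.DataFrame({"feature_a": [0.0, 2.0], "feature_b": [1.0, 5.0]})

result = MeanDifferenceComparator().compare(group0, group1)
print(result)
```

Keeping each comparison method in its own class means a new metric can be added without touching the code that enumerates and pairs groups.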