Problem Description
Specifically, my question stems from wanting to one-hot encode the component part of the result of applying the ClusterBasedNormalizer. My custom generator doesn't deal well with non-one-hotted categoricals, and I'd like to change the HyperTransformer (or the underlying one in an SDV Synthesizer) as little as possible. Currently I am defining a second HyperTransformer with OneHotEncoders for all the columns that use ClusterBasedNormalizer and None for all other columns, but this is a little tedious and feels suboptimal as a workflow. Looking at the code for the ClusterBasedNormalizer, it could make sense to expose a choice to the user as to how the component column is encoded. Alternatively, making transformers composable through some framework could be a nice addition to the package.
Expected behavior
One of:
cbn = ClusterBasedNormalizer(..., onehot = True, ...)
ht = HyperTransformer(); ht.update_transformers({"colname": [ClusterBasedNormalizer(...), OneHotEncoder()]})
o.e.
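
For reference, a minimal sketch of a lighter interim workaround than the second HyperTransformer described above, using only pandas. It is not part of RDT or SDV: the helper name one_hot_components is hypothetical, and it assumes the ClusterBasedNormalizer output columns are named with a ".component" suffix, hold integer-valued component indices, and contain no missing values.

```python
import pandas as pd


def one_hot_components(transformed: pd.DataFrame, suffix: str = ".component") -> pd.DataFrame:
    """One-hot encode every column whose name ends with `suffix`.

    Hypothetical helper, not an RDT API. Assumes the matching columns hold
    integer-valued component indices (possibly stored as floats) with no NaNs.
    """
    component_cols = [c for c in transformed.columns if str(c).endswith(suffix)]
    if not component_cols:
        # Nothing to encode; return a copy so the caller's frame is untouched.
        return transformed.copy()

    out = transformed.copy()
    # Cast to int so the new columns are named "x.component_0", not "x.component_0.0".
    out[component_cols] = out[component_cols].astype(int)
    # get_dummies drops the encoded columns and appends one indicator column
    # per observed value, after the remaining columns.
    return pd.get_dummies(out, columns=component_cols, dtype=float)


if __name__ == "__main__":
    # Stand-in for the output of the first HyperTransformer.
    example = pd.DataFrame({
        "amount.normalized": [0.1, -0.2, 0.3],
        "amount.component": [0.0, 2.0, 0.0],
        "age": [31, 45, 27],
    })
    print(one_hot_components(example))
```

Note that only component values actually present in the data get an indicator column, so the output width can differ between datasets, and the indicators would need to be collapsed back to a single index (for example with an argmax) before calling reverse_transform. A built-in option or composable transformers, as requested above, would avoid both issues.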