Problem Description
As indicated in this issue, some users have found that applying a min/max scaling significantly improved the synthetic data quality.
However, the RDT library currently does not offer min/max scaling. It only offers the GaussianNormalizer (which uses the z-score) and the ClusterBasedNormalizer (which uses Bayesian GMMs).
Expected behavior
Min/max scaling will need to learn the min and max values during the fit stage. When transforming, it will take the entire distribution and transform it into the range [0, 1] by using the formula:
(value - min) / (max - min)
Finally, the reverse transform will expand values back into the original [min, max] range, ensuring that out-of-bounds values are clipped.
Additional context
This is a tracking issue. The exact API (including the transformer name, parameters, etc.) still needs to be figured out.
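Since the API is still open, the following is only a minimal sketch of the fit / transform / reverse transform behavior described under Expected behavior. It uses plain NumPy and is not an RDT class: the name `MinMaxScalerSketch`, its method names, and the handling of a constant column (where max equals min) are all assumptions made for illustration.

```python
import numpy as np


class MinMaxScalerSketch:
    """Hypothetical min/max scaler for illustration; not part of RDT."""

    def fit(self, values):
        # Learn the min and max of the column during the fit stage.
        data = np.asarray(values, dtype=float)
        self.min_ = float(np.nanmin(data))
        self.max_ = float(np.nanmax(data))
        return self

    def transform(self, values):
        # Apply (value - min) / (max - min) to map fitted data into [0, 1].
        data = np.asarray(values, dtype=float)
        span = self.max_ - self.min_
        if span == 0:
            # Assumption: a constant column maps to 0.0 to avoid dividing by zero.
            return np.zeros_like(data)
        return (data - self.min_) / span

    def reverse_transform(self, scaled):
        # Expand back into [min, max] and clip anything that falls outside it.
        data = np.asarray(scaled, dtype=float)
        restored = data * (self.max_ - self.min_) + self.min_
        return np.clip(restored, self.min_, self.max_)


scaler = MinMaxScalerSketch().fit([10.0, 20.0, 30.0])
scaled = scaler.transform([10.0, 20.0, 30.0])
# Scaled inputs below 0 or above 1 are clipped back to the learned bounds.
restored = scaler.reverse_transform([-0.2, 0.5, 1.3])
```

In this sketch, clipping happens only in the reverse transform, which matches the description above; whether the forward transform should also clip unseen values outside the fitted range is one of the details the final API would need to settle.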