Add a transformer for min/max normalization #863

npatki · 2024-08-09T14:25:51Z

Problem Description

As indicated in this issue, some users have found that applying min/max scaling significantly improved the quality of their synthetic data.

However, the RDT library currently does not offer min/max scaling. It only offers the GaussianNormalizer (which uses the z-score) and the ClusterBasedNormalizer (which uses Bayesian GMMs).

Expected behavior

Min/max scaling will need to learn the min and max values during the fit stage. When transforming, it will map the entire distribution into the range [0, 1] using the formula (value - min) / (max - min). Finally, the reverse transform will expand values back into the original [min, max] range, clipping any out-of-bounds values.
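
To make the expected behavior concrete, here is a minimal sketch of the three stages in plain Python. The class name `MinMaxSketch` and its attributes are hypothetical and are not part of the RDT API, which is still undecided. Mapping a constant column (max equal to min) to 0 is an assumption made here to avoid dividing by zero, and missing values are not handled.

```python
class MinMaxSketch:
    """Hypothetical illustration only; not an RDT transformer."""

    def fit(self, values):
        # Learn the bounds of the column during the fit stage.
        self.min_ = min(values)
        self.max_ = max(values)
        return self

    def transform(self, values):
        span = self.max_ - self.min_
        if span == 0:
            # Assumption: a constant column maps to 0 (avoids division by zero).
            return [0.0 for _ in values]
        # (value - min) / (max - min) maps fitted values into [0, 1].
        return [(v - self.min_) / span for v in values]

    def reverse_transform(self, scaled):
        span = self.max_ - self.min_
        # Clip to [0, 1] first so the output stays inside [min, max].
        return [min(max(s, 0.0), 1.0) * span + self.min_ for s in scaled]


scaler = MinMaxSketch().fit([10.0, 20.0, 30.0])
print(scaler.transform([10.0, 20.0, 30.0]))        # [0.0, 0.5, 1.0]
print(scaler.reverse_transform([-0.5, 0.5, 1.5]))  # [10.0, 20.0, 30.0]
```

In the second call, the scaled values -0.5 and 1.5 fall outside [0, 1], so they are clipped and come back as the learned min and max.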

Additional context

This is a tracking issue. The exact API (including the transformer name, parameters, etc.) still needs to be figured out.

npatki added the "feature request" (Request for a new feature) and "feature:transformer" (Related to adding a new transformer) labels on Aug 9, 2024