So you want to develop a new metric?: Read this first! #121
Locked
npatki
announced in
Announcements
The SDMetrics library provides a set of tools for evaluating synthetic data. We're excited to foster an open source community of users who contribute metrics for the many different uses of synthetic data.
We welcome new ideas for metrics! If you want to contribute your metric to the library, read through this discussion and then reach out to the SDV maintainers by filing an issue or messaging us on Slack.
Defining a metric
All metrics in this library are model-agnostic. Anyone who wants to use your metric should already have:
Base Usage
The base version of your metric takes in real and synthetic data in the smallest possible unit. This could be a column, a pair of columns, a table, etc. For example, if your metric computes pairwise correlations, the base unit is a pair of columns.
The base metric is a class with a `compute` method. The method takes in the unit of real data, the unit of synthetic data, and any other keyword args you want to add. It returns a score, represented as a floating point value.
Aggregate Usage
In many cases, you may want to iterate through the entire dataset to apply the base metric to different columns, pairs of columns, tables, etc. You can write a convenience method called `compute_breakdown` that performs this iteration. This method takes in the full real data, synthetic data, keyword args and metadata. According to the metadata, you can determine when to apply the base metric. The method returns a dictionary of results, keyed by the base unit.
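To make the base and aggregate usage concrete, here is a minimal sketch of how the two methods could fit together. Everything in it is an assumption for illustration: the `MeanSimilarity` class, its scoring formula, the plain-dict data, and the metadata shape (a dict mapping column name to a type string) are all hypothetical and are not part of SDMetrics. Extra keyword args are left out to keep it short.

```python
from statistics import mean


class MeanSimilarity:
    """Hypothetical column-level metric for illustration; not an SDMetrics class."""

    @staticmethod
    def compute(real_column, synthetic_column):
        """Score one column (the base unit) as a float in [0, 1]."""
        value_range = max(real_column) - min(real_column)
        if value_range == 0:
            # Constant real column: an exact mean match scores 1.0, otherwise 0.0.
            return 1.0 if mean(synthetic_column) == mean(real_column) else 0.0
        # Gap between the means, scaled by the spread of the real data.
        gap = abs(mean(real_column) - mean(synthetic_column)) / value_range
        return max(0.0, 1.0 - gap)

    @classmethod
    def compute_breakdown(cls, real_data, synthetic_data, metadata):
        """Apply compute to every column the metadata marks as numerical."""
        results = {}
        for column_name, column_type in metadata.items():
            if column_type == 'numerical':
                results[column_name] = cls.compute(
                    real_data[column_name], synthetic_data[column_name]
                )
        return results


# Toy data: plain dicts of lists stand in for real tables.
real = {'age': [20, 30, 40], 'city': ['A', 'B', 'C']}
synthetic = {'age': [25, 30, 45], 'city': ['A', 'A', 'C']}
metadata = {'age': 'numerical', 'city': 'categorical'}  # assumed metadata shape

scores = MeanSimilarity.compute_breakdown(real, synthetic, metadata)
print(scores)  # one float per numerical column, keyed by column name
```

Here the metadata decides which columns the base metric applies to, so the categorical `city` column is skipped and the result has a single entry for `age`.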
Think through this before implementing a metric
We are always aiming to improve the feature set and usability of SDMetrics. Before implementing your metric, think through the questions below.
1. Description
2. Basic Information
3. Usage
4. Interpretation
Next Steps
The questions in this discussion should give you a clearer idea of how your metric should work. You can file an issue or reach out to the SDV team on Slack to discuss the next steps of implementing your metric.