Currently UncoverML is controlled by a YAML file that gets read into a Config object, which exposes the various key: value pairs as attributes.
This object has gotten pretty complex and there are a lot of dependencies between the attributes. It's also the biggest cause of tests breaking: attributes get added or modified, and then in certain execution paths the code reads or checks an attribute that no longer exists on the config object. It would be great to streamline this.
YAML is also an issue. It's really easy to make mistakes with: it's very syntax sensitive, and small typos can lead to confusing errors. It also makes it hard to verify that the user has provided the correct values for the desired workflow. And the biggest issue (in my opinion) is that parameter name typos aren't handled. The user might think they've provided an optional parameter to activate a feature, but the key is misspelled. Because the YAML file is parsed by looking up parameters by key, that parameter won't get set and the related processing won't occur. If that doesn't cause any errors (often the case with optional features/parameters), the user won't realise.
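As a stopgap for the misspelled-key problem, the parsed YAML could be checked against a list of known parameter names before anything is looked up. The following is only a minimal sketch: the key names in `KNOWN_KEYS` and the `check_keys` helper are hypothetical, not UncoverML's actual parameters or API, and the input is assumed to be the plain dict a YAML parser returns.

```python
import difflib

# Hypothetical parameter names, for illustration only.
KNOWN_KEYS = {"features", "targets", "learning", "validation", "output"}


def check_keys(parsed, known=KNOWN_KEYS):
    """Raise ValueError if the parsed config has keys that are not recognised."""
    unknown = sorted(set(parsed) - set(known))
    if not unknown:
        return
    messages = []
    for key in unknown:
        # Suggest the closest known name, if there is one that is similar enough.
        close = difflib.get_close_matches(key, sorted(known), n=1)
        hint = f" (did you mean '{close[0]}'?)" if close else ""
        messages.append(f"unknown parameter '{key}'{hint}")
    raise ValueError("; ".join(messages))
```

With this in place, a config containing `validaton` instead of `validation` would fail loudly at load time instead of silently skipping the optional step.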
Another concern is that the Config object contains state: it owns the FeatureSet and TransformSet objects. These hold the paths to covariate data and the covariate statistics that are used for applying transforms.
I've been considering the Python module route. That is, have a config.py module that the user is expected to modify. The parameter names are baked in as attributes, so there's no concern about parameter name typos. It also gets around a lot of YAML's annoying syntax issues. However, I'm open to any solution that keeps things simple and solves the issues mentioned above.
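One way the "baked-in attributes" idea could look is a frozen dataclass whose fields are the only accepted parameter names. This is a sketch under assumptions, not a proposed final design: every field name and default below is hypothetical, and the validation rules are placeholders.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Config:
    # All field names and defaults here are hypothetical.
    covariate_paths: Tuple[str, ...]
    target_path: str
    algorithm: str = "randomforest"
    crossval_folds: Optional[int] = None  # optional feature, off by default

    def __post_init__(self):
        # Values are checked once, at construction time.
        if not self.covariate_paths:
            raise ValueError("at least one covariate path is required")
        if self.crossval_folds is not None and self.crossval_folds < 2:
            raise ValueError("crossval_folds must be at least 2")
```

A user's config.py would then construct one instance, e.g. `Config(covariate_paths=("a.tif",), target_path="t.shp", crossval_folds=5)`. Misspelling a parameter (say `crossval_fold=5`) raises a `TypeError` when the object is built, and because the dataclass is frozen, attributes can't be added or changed later on.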
This is a laborious task, as just about everything in UncoverML touches the Config object. It also means extracting the stateful FeatureSet and TransformSet from it.
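To illustrate what extracting the state might mean, here is a rough sketch in which the user-supplied settings stay immutable and the run-time state is built from them separately. `Settings`, `CovariateState` and `build_state` are hypothetical stand-ins, not the real FeatureSet/TransformSet classes.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Settings:
    """Plain user-supplied values only (hypothetical field)."""
    covariate_paths: Tuple[str, ...]


class CovariateState:
    """Stand-in for the stateful part: holds statistics computed at run time."""

    def __init__(self, paths):
        self.paths = tuple(paths)
        self.stats: Dict[str, float] = {}

    def record_mean(self, path, value):
        # Only accept statistics for covariates this state was built for.
        if path not in self.paths:
            raise KeyError(path)
        self.stats[path] = value


def build_state(settings):
    """Create fresh run-time state from the settings; settings are not mutated."""
    return CovariateState(settings.covariate_paths)
```

The point of the split is that code which only needs parameters takes `Settings`, while code that computes or applies statistics takes the state object explicitly, so tests can construct either one in isolation.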
Another thought would be a GUI or website that generates config files, or at least the important skeletons. A config.py that you can automate would of course be good too.