Minimum and maximum value arguments (constraints) #9
Comments
Thanks @ThirstyGeo for raising this issue -- completely agree that it would be a really useful feature. The best way to implement this is probably to allow users to change the activation functions for specific output nodes in the network, so that the model incorporates this range trimming within training itself. We will look into this as a priority, and if you have any further suggestions or pull requests, they would be gratefully received.
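To make the activation-function idea concrete, here is a minimal NumPy sketch of two output activations that keep values inside a range. It is an illustration only, not MIDASpy's implementation; the function names `softplus` and `bounded_sigmoid` and the example bounds are assumptions chosen for this sketch.

```python
import numpy as np

def softplus(z):
    # Smooth map from the real line to non-negative values.
    # logaddexp(0, z) computes log(1 + exp(z)) without overflow.
    return np.logaddexp(0.0, z)

def bounded_sigmoid(z, lower, upper):
    # Rescaled logistic function: maps the real line into [lower, upper].
    # The tanh form of the logistic avoids overflow for large |z|.
    return lower + (upper - lower) * 0.5 * (1.0 + np.tanh(0.5 * z))

# Hypothetical raw (pre-activation) outputs for one output node.
raw = np.array([-5.0, -0.5, 0.0, 2.0, 8.0])

positive = softplus(raw)                    # suits a "minimum of zero" column
percent = bounded_sigmoid(raw, 0.0, 100.0)  # suits a column bounded on both sides
```

Because the constraint sits in the output layer, the loss is computed on values that are already in range, which is what distinguishes this approach from trimming imputed values afterwards.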
That's great @tsrobinson! Much appreciated that you are focusing on this. I'll think a bit more through the typical workflows and see if I can create an example which represents a typical situation. If you like it, it could be something for the package's examples/tutorials.
As a tangent of interest: few research articles address imputation of compositional data in the simplex. The best one I'm aware of in deep-learning-oriented research on imputing compositional data deals with the specific case of 'censored zeroes', i.e., values which are below the analytical detection limit and above zero (the only information usually given is that the values are below a certain threshold). The article focusses on ANNs and on feature pre-processing (using log-ratio transformations on the features to move them out of the simplex and into Euclidean space). The autoencoder approach of MIDASpy has the significant potential advantages of (1) allowing mixed data types, (2) not requiring a pre-processing step, and (3) producing multiple realisations and therefore a measure of confidence for imputed values. Very exciting!
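For readers unfamiliar with log-ratio pre-processing, the following is a minimal sketch of one common choice, the centred log-ratio (CLR) transform, and its inverse. Whether the article mentioned above uses CLR or another log-ratio variant is not stated here, so treat the choice as an assumption; the function names `clr` and `clr_inverse` are made up for this sketch.

```python
import numpy as np

def clr(x):
    # Centred log-ratio: log of each part minus the row mean of the logs.
    # Rows of x must be strictly positive; log(0) is undefined, which is
    # why zeros below the detection limit need special treatment.
    logx = np.log(x)
    return logx - logx.mean(axis=1, keepdims=True)

def clr_inverse(y):
    # Inverse CLR is a row-wise softmax, which returns to the simplex.
    e = np.exp(y - y.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

# Two hypothetical three-part compositions (each row sums to 1).
comp = np.array([[0.70, 0.20, 0.10],
                 [0.25, 0.25, 0.50]])

coords = clr(comp)          # unconstrained coordinates; each row sums to 0
back = clr_inverse(coords)  # recovers the original compositions
```

Any model fitted on `coords` produces values that map back to positive parts summing to one, so the non-negativity constraint is satisfied by construction.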
Really interesting - thanks @ThirstyGeo for letting us know about this research.
Hello and thank you for this great package. I wanted to inquire whether you have had any progress on this issue? We have a data set with a lot of count data variables, and many of them get imputed with negative values, which isn't ideal. Hence, our interest :)
Any news on this? Or maybe a small idea on how or where this would fit best in the code if I were to toy around with it myself? :)
Hi @geraldine28 @kblnig, we are looking into this now and will get back to you shortly. Sorry about the delay!
@ranjitlall - really looking forward to this :) !!!!
Echoing others' enthusiasm, I'm also wondering if there's any news on this feature
Looking forward to this feature!
Thanks everyone for your interest! I can confirm this is now under development, and will update you asap when this functionality is ready for release.
Hello! I saw that you added this new feature, but when I try to call .build_model with the argument positive_columns(), Python tells me it does not exist. Is it still available or have you removed it? Thanks
Hi @CoralieGilbert, it's still available but we haven't released to PyPI -- I will try to do this by the end of the week and let you know when it's done. Best,
Thank you so much! Best,
All done, @CoralieGilbert! You should now be able to install the new version from PyPI. Just to note, there are some tensorflow incompatibilities with the new numpy 2.X versions, so if you cannot install/load this new version, try downgrading numpy to 1.26.4 and try again. Any other problems, just let me know :)
I'm working with Dirichlet distributions and the compositional data simplex, and am really enjoying MIDASpy's flexibility when dealing with this data (related to K-L divergence in the decoder). However, there is a tendency to produce negative values in the numerical feature data I have been using.
In the case of compositional data, there is a constraint of zero as a minimum value. Other imputation approaches allow setting maximum and minimum value arguments (e.g., Scikit-Learn) and importantly these can be set per feature (autoimpute). Is this an argument which could be added to the package? It would be a major help to people working in several disciplines.
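As a rough illustration of what per-feature minimum and maximum arguments amount to, here is a sketch that applies column-specific bounds to already-imputed values with NumPy. This is a post-hoc stopgap, not a feature of MIDASpy, and the array contents and the names `lower` and `upper` are invented for the example.

```python
import numpy as np

# Hypothetical imputed values: rows are observations, columns are features.
imputed = np.array([[ 3.2, -0.4, 101.5],
                    [-1.1,  2.0,  47.0]])

# One bound per feature; np.inf leaves a feature unbounded above.
lower = np.array([0.0, 0.0, 0.0])
upper = np.array([np.inf, np.inf, 100.0])

# np.clip broadcasts the bound vectors across rows, so each column
# is trimmed to its own [lower, upper] interval.
bounded = np.clip(imputed, lower, upper)
```

Clipping after the fact piles probability mass onto the boundary values and, for compositional data, breaks the sum-to-one constraint unless rows are renormalised, which is part of why a constraint built into the model is preferable.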