Adding PyTorch support in data_utils #79
Merged
In this pull request, I add the option of using PyTorch in the data_utils class. This should allow users to obtain a PyTorch dataset from the load_ncdata_with_generator function.

The following changes occurred:

- data_utils.py: one can now choose an ml_backend (currently either PyTorch or TensorFlow, defaulting to TensorFlow), which is used in load_ncdata_with_generator and thus elsewhere in the class.
- setup.py: one can now set an environment variable to install PyTorch instead of TensorFlow. Running setup.py as usual, without setting the variable, still installs TensorFlow.
- testing_data_utils_with_backends.py: a small script that checks that both backends can be used correctly and that one can still save things to NumPy arrays.

I was also informed that a new testing framework is coming; I am happy to open another PR at that time with proper tests. For now, I have tested the logic in data_utils.py with the script testing_data_utils_with_backends.py (including a comparison of output arrays) and the changes to setup.py by installing into different virtual environments.
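
For readers who want a feel for the backend selection described above, here is a minimal sketch, not the code from this PR. It assumes the backend names are the strings "tensorflow" and "pytorch", and that the environment variable is called ML_BACKEND; the description does not name the variable, so that name and both helper functions are hypothetical.

```python
import os

# Assumed spellings of the two backends; the PR only says "PyTorch or TensorFlow".
SUPPORTED_BACKENDS = ("tensorflow", "pytorch")


def resolve_backend(ml_backend="tensorflow"):
    """Normalize and validate a backend name (hypothetical helper).

    Defaults to TensorFlow, matching the default described in the PR.
    """
    name = str(ml_backend).strip().lower()
    if name not in SUPPORTED_BACKENDS:
        raise ValueError(
            f"Unknown ml_backend {ml_backend!r}; expected one of {SUPPORTED_BACKENDS}"
        )
    return name


def backend_requirement(environ=None, variable="ML_BACKEND"):
    """Pick the pip package to install based on an environment variable.

    `variable` is an assumed name. If it is unset, TensorFlow is chosen,
    so a plain install behaves as before.
    """
    environ = os.environ if environ is None else environ
    backend = resolve_backend(environ.get(variable, "tensorflow"))
    # PyTorch is published on PyPI under the package name "torch".
    return {"tensorflow": "tensorflow", "pytorch": "torch"}[backend]
```

Under these assumptions, a setup.py could add the result of backend_requirement() to its install_requires list, and load_ncdata_with_generator could call resolve_backend(ml_backend) before importing only the framework that was requested.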