ASLDVS file_md5 seems broken #197

Closed
jrfinkbeiner opened this issue Jun 23, 2022 · 6 comments · Fixed by #198

@jrfinkbeiner
Contributor

Hi,

I just tried to download the ASL-DVS dataset via tonic.datasets.ASLDVS.
However, the download fails with a complaint about a wrong MD5 signature. It expects "20f1dbf961f9a45179f6e489e93c8f2c", which is hardcoded into the class, but on two different systems I get "33f8b87bf45edc0bfed0de41822279b9" using tonic.download_utils.calculate_md5.

In both cases I am using a freshly installed tonic version via pip install tonic, tonic version '1.0.20'.

Running something like this should reproduce the result:

import tonic
try:
    tonic.datasets.ASLDVS(save_to="./")
except Exception as e:
    print(e)
actual_md = tonic.download_utils.calculate_md5("./ASLDVS/ASLDVS.zip")
print(actual_md)

It would be nice if somebody could check whether I am wrong, or whether the attribute file_md5 in the class tonic.datasets.ASLDVS is actually wrong.

Thanks,
Jan

@fabrizio-ottati
Collaborator

Hi @jrfinkbeiner. I receive the same error when using Tonic to download it.
As you said, the MD5 signature has probably changed (is that possible, though?) and needs to be updated in the source code of the dataset class. Are you able to manually download the dataset from the internet and calculate its MD5 signature? The link should be this one.

@jrfinkbeiner
Contributor Author

@fabhertz95 I did the manual download (via the link provided in the dataset class) and I get the same signature as mentioned above, "33f8b87bf45edc0bfed0de41822279b9", which is different from the one in the class.

@fabrizio-ottati
Collaborator

Nice! Well, you can open a PR that corrects the MD5 signature and passes all the tests, if you want. I would wait for @biphasic's opinion, but I think this is the best option.

@biphasic
Member

Thank you @jrfinkbeiner @fabhertz95. The hash is hardcoded to guard against corrupted files, see https://github.com/pytorch/vision/blob/main/torchvision/datasets/mnist.py#L41 for an example in torchvision. From what it looks like, the original data didn't change (although we wouldn't know if a file was deleted, for example). It could be that the way Dropbox zips the folders changed slightly. In any case, thanks for reporting it, and please open a PR with the new hash. Thank you!
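
For anyone who wants to double-check a downloaded archive independently of tonic, here is a minimal sketch of the kind of integrity check a hardcoded hash enables. It uses only Python's standard hashlib; the helper names md5_of_file and matches_expected_md5 are made up for illustration and are not tonic's or torchvision's API.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, reading it in chunks
    so that large archives do not have to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read until it returns b"".
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected_md5(path, expected_md5):
    """Hypothetical helper: True only if the file exists and its MD5
    digest equals the expected hex string (case-insensitive)."""
    path = Path(path)
    return path.is_file() and md5_of_file(path) == expected_md5.lower()
```

Calling matches_expected_md5("./ASLDVS/ASLDVS.zip", "20f1dbf961f9a45179f6e489e93c8f2c") on the archive from this issue would presumably return False, while md5_of_file on the same path should print the value actually observed, assuming the download itself completed.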

@jrfinkbeiner
Contributor Author

Sounds good, will do.

@biphasic
Member

mentioning issue #146 too
