fetch_data might not be working correctly for mini-Imagenet? #1

Open
GhassenJ opened this issue Mar 8, 2018 · 19 comments

@GhassenJ

GhassenJ commented Mar 8, 2018

Hi,
I followed the instructions, but the mini-ImageNet code keeps exiting with:
OSError: cannot identify image file <_io.BufferedReader name='data/miniimagenet/train/n04515003/n04515003_15351.JPEG'>
Most of the images in this folder (n04515003) are empty, yet fetch_data seemed to exit normally without any error messages.
I counted the empty files under the mini-ImageNet subfolders, and 53186 of 59981 files are empty.
Omniglot seems fine, however.
Any idea what could be wrong with the script, or what I could be doing wrong?
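
A count like the one above can be reproduced with a short script. This is a minimal sketch using only the standard library; `count_empty_files` is a hypothetical helper, not part of the repository, and the `data/miniimagenet` path is taken from the traceback above.

```python
import os


def count_empty_files(root):
    """Return (empty, total): zero-byte files and all files found under root."""
    empty = total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            total += 1
            if os.path.getsize(os.path.join(dirpath, name)) == 0:
                empty += 1
    return empty, total


# Example usage (path assumed from the traceback in this report):
#   empty, total = count_empty_files("data/miniimagenet")
#   print(f"{empty}/{total} files are empty")
```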

@unixpickle
Contributor

Thanks for reporting this. Are some of the ImageNet images valid? If so, is there a general pattern as to which ones are empty?

I'm not sure what could cause this. Perhaps the ImageNet server is doing some kind of rate-limiting. If so, it may be possible to modify the script to detect this and print an error.

I'd expect Omniglot to be fine, since the Omniglot download process is much simpler than that for Mini-ImageNet.
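
One way a download script could detect failed downloads and report an error, as suggested above, is to check every file after fetching. The following is only a sketch under stated assumptions, not the repository's code: `find_bad_jpegs` and `check_download` are hypothetical names, and the check only verifies that a file begins with the JPEG start-of-image bytes (FF D8). That flags empty files and non-image responses such as HTML error pages, but it does not catch truncated images.

```python
import os
import sys

JPEG_SOI = b"\xff\xd8"  # JPEG "start of image" marker


def find_bad_jpegs(root):
    """Return sorted paths under root that do not start with the JPEG SOI marker."""
    bad = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            with open(path, "rb") as f:
                if f.read(2) != JPEG_SOI:  # empty files fail this check too
                    bad.append(path)
    return sorted(bad)


def check_download(root):
    """Print an error and return 1 if any file looks invalid, else return 0."""
    bad = find_bad_jpegs(root)
    if bad:
        print(f"error: {len(bad)} invalid image files under {root}", file=sys.stderr)
        return 1
    return 0
```

A fetch script could call `sys.exit(check_download("data/miniimagenet"))` as its last step so that a silent failure becomes a non-zero exit code.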

@nattari

nattari commented Apr 14, 2018

While trying to download the images from the list of ImageNet URLs, I found that some of the image IDs do not exist. Is there a particular reason for that? For example, in the test data, n01930112_10035 is not in the list. I used the "List of all image URLs of Fall 2011 Release".

TIA

@unixpickle
Contributor

@nattari some of the images are not in the 2011 release, since the dataset is from the 2012 release. That's why the download script extracts files from the 2012 tar file. If there is a better API for getting 2012 images, let me know.

@nattari

nattari commented Apr 24, 2018

Hmm, I am using the images from the 2011 release at the moment. Since I am more interested in understanding the algorithm, I guess that will work too. If I find a better source, I will definitely share it.

Could you please tell me what GPU configuration you use for training on mini-ImageNet, and how long training takes?

@unixpickle
Contributor

I used a single 1080 Ti for most of the experiments. For all the benchmarks, training takes less than a day. The exact time depends on the hyper-parameters and dataset you use.

@nattari

nattari commented Jun 10, 2018

I started training on some other ImageNet data. Everything works fine, but I get this warning: "Possibly corrupt exif file". Training gets stuck after some iterations. Do you have any clue what the problem could be? The only thing I changed is the data. Is it due to the warning?

@unixpickle
Contributor

Is it possible that some class directories are empty or don't contain enough samples? I think it's possible to hang the training loop if there aren't enough samples to create a mini-batch, since it keeps looping over the data forever hoping to create a whole mini-batch.
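
A quick way to rule this out is to count the usable files in each class directory before training. This is a minimal sketch, not part of the repository: `undersized_classes` is a hypothetical helper, and `min_samples` is an assumed threshold that should be set to however many images per class the training configuration needs.

```python
import os


def undersized_classes(split_dir, min_samples):
    """Map class directory name -> non-empty file count, for classes below min_samples."""
    result = {}
    for entry in sorted(os.listdir(split_dir)):
        class_dir = os.path.join(split_dir, entry)
        if not os.path.isdir(class_dir):
            continue
        count = 0
        for name in os.listdir(class_dir):
            path = os.path.join(class_dir, name)
            # Count only regular files that actually contain data.
            if os.path.isfile(path) and os.path.getsize(path) > 0:
                count += 1
        if count < min_samples:
            result[entry] = count
    return result


# Example usage (path and threshold are assumptions):
#   print(undersized_classes("data/miniimagenet/train", min_samples=20))
```

An empty result means every class directory holds at least `min_samples` non-empty files, so the hang would have to come from somewhere else.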

@nattari

nattari commented Jun 10, 2018

I thought about it and made sure that all the class directories contain enough samples, so that doesn't seem to be the problem. What I fear at the moment is the warning, but I'm not sure.

@unixpickle
Contributor

Huh, interesting. It would be nice to know where the program is stuck. When you kill the process, does Python print out a stack trace? If not (e.g. if the hang is inside the TF graph), maybe it will be helpful to attach a debugger to the process and look at a backtrace that way.
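
As an alternative to killing the process or attaching a debugger, Python's standard `faulthandler` module can dump a traceback from a live process on demand. The sketch below is an assumption about how one might wire this into a training script, not something the repository does; `enable_hang_dump` is a hypothetical name, and `faulthandler.register` is only available on Unix-like systems. If the hang is inside native TensorFlow code, the dump will only show the Python frame that called into it.

```python
import faulthandler
import signal
import sys


def enable_hang_dump(signum=None, file=None):
    """On receipt of signum, write every thread's Python traceback to file.

    Defaults: SIGUSR1 (Unix only) and sys.stderr. The file must stay open
    for as long as the handler is registered.
    """
    if signum is None:
        signum = signal.SIGUSR1
    if file is None:
        file = sys.stderr
    faulthandler.register(signum, file=file, all_threads=True)
    return signum
```

Call `enable_hang_dump()` once near the start of the training script. When training appears stuck, run `kill -USR1 <pid>` from another shell and the tracebacks are written without stopping the process.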

@nattari

nattari commented Jun 12, 2018

I believe it is stuck in the TF graph; I am trying to debug it now. Here is a screenshot in case you can spot something fishy.
[Screenshot from 2018-06-12 15-25-01]

@nattari

nattari commented Jun 14, 2018

I am observing very low GPU utilization for both Omniglot and MiniImageNet (~2-3% or even less). This shouldn't be the case, I believe?
Also, I am using a 1080i, and for MiniImageNet it is only utilizing ~500MB of memory, which doesn't change even if I change the batch size. Can you provide some insight into this behaviour?
(I am using Python 3.6, TensorFlow 1.8, and CUDA 9.0.)

TIA.

@unixpickle
Contributor

@nattari at first, things will be slow because the training pipeline is still loading the images into memory and resizing them on the fly. After training has run for a little while, the images will all be cached in memory, and you should start to see higher GPU utilization.

@unixpickle
Contributor

As for memory, I'm not entirely sure. If you're referring to GPU memory, I think TensorFlow allocates memory in large blocks, so you might not see subtle changes. If you mean Python memory, then this is expected, since Python's memory usage will be dominated by loading and caching images.

@lampardwk

lampardwk commented Jun 17, 2019

@unixpickle I had the same problem with the incomplete miniimagenet data downloaded by fetch_data.sh: most of the files in the folders are empty images. Could you send me a complete data set? My email address is [email protected]. Thanks.

@Liuyubao

Liuyubao commented Jul 2, 2019

@unixpickle Sorry to bother you, but I had the same problem with the incomplete mini-imagenet data downloaded by fetch_data.sh: most of the files in the folders are empty images. Could you also send me a complete data set? My email address is [email protected]. Thanks a lot for your time and patience.

@eghouti

eghouti commented Oct 25, 2019

Hello @unixpickle,

First, I would like to thank you for your excellent work, which helps me a lot in my research. Could I have the mini-ImageNet dataset you used to run these experiments? My email address is [email protected]

Best regards,

Ghouthi

@ligeng0197

Hi @unixpickle.

I found that the MiniImageNet source URL in fetch_script is no longer valid, so I looked for another source on the ImageNet website and found one (http://www.image-net.org/challenges/LSVRC/2012/dd31405981ef5f776aa17412e1f0c112/ILSVRC2012_img_train.tar).
However, after replacing the URL in fetch_script and downloading the images, I ran into the empty-images problem mentioned by others. I got 13 empty images in the train and val datasets, and I decided to replace them manually with images of the same objects. Unfortunately, after replacing them, training still fails with (OSError: image file is truncated (26 bytes not processed)). I believe it's caused by some incomplete images in the train dataset, but I am kind of tired of fixing it by hand. So would you mind sharing the miniimagenet dataset on Google Drive or some other place where we can download it directly? Thanks in advance.

P.S. While replacing the empty images, I found that the MiniImageNet used here is a little different from the one used in pytorch-MAML (https://github.com/dragen1860/MAML-Pytorch).
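
Incomplete images like the ones described above can be hunted down automatically instead of by hand. This is a rough heuristic sketch, not a definitive validator: `may_be_truncated` and `list_truncated` are hypothetical helpers, and they assume every file under the directory is meant to be a JPEG. A complete JPEG normally ends with the end-of-image marker (FF D9), so files without it are likely truncated, although a few valid files carry trailing bytes and would be flagged as well.

```python
import os

JPEG_EOI = b"\xff\xd9"  # JPEG "end of image" marker


def may_be_truncated(path):
    """Heuristic: True if the file is very short or does not end with the EOI marker."""
    if os.path.getsize(path) < 4:
        return True
    with open(path, "rb") as f:
        f.seek(-2, os.SEEK_END)
        return f.read(2) != JPEG_EOI


def list_truncated(root):
    """Return sorted paths under root that look truncated."""
    return sorted(
        os.path.join(dirpath, name)
        for dirpath, _dirnames, filenames in os.walk(root)
        for name in filenames
        if may_be_truncated(os.path.join(dirpath, name))
    )
```

The files returned by `list_truncated("data/miniimagenet/train")` are candidates to re-download or re-extract; confirming each one by actually decoding it with an image library is the more reliable check.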

@asd81310

Did anyone get the correct mini-imagenet dataset for these experiments? If you did, could you share it with me? Thanks very much for your help. My email address is [email protected].

@XA23i

XA23i commented Sep 20, 2022

You can follow the instructions at https://github.com/dragen1860/MAML-Pytorch.
Then modify line 53 of miniimagenet.py:
names = [f for f in os.listdir(self.dir_path) if f.endswith('.jpg')]
