Merge pull request #99 from libffcv/main
Update no_jit_assert branch with bug fixes
andrewilyas authored Jan 24, 2022
2 parents eb06acf + 1d94d23 commit d817cc2
Showing 7 changed files with 51 additions and 10 deletions.
32 changes: 32 additions & 0 deletions docker/Dockerfile
@@ -0,0 +1,32 @@
FROM pytorch/pytorch:latest

RUN apt-get update && apt-get install -y --no-install-recommends \
software-properties-common \
build-essential \
curl \
git \
ffmpeg

RUN conda create -n ffcv python=3.9 \
cupy \
pkg-config \
compilers \
libjpeg-turbo \
opencv \
pytorch \
torchvision \
cudatoolkit=11.3 \
numba -c pytorch -c conda-forge

RUN echo "source activate" >> ~/.bashrc
RUN echo "conda activate ffcv" >> ~/.bashrc

RUN git clone https://github.com/libffcv/ffcv.git

RUN conda run -n ffcv pip install ffcv

# To test:
# 1- build the Dockerfile (e.g. docker build -t ffcv .)
# 2- login to the docker container (e.g. docker run -it --gpus all ffcv bash)
# 3- cd ffcv/examples/cifar
# 4- bash train_cifar.sh
4 changes: 2 additions & 2 deletions docs/ffcv_examples/cifar10.rst
@@ -106,8 +106,8 @@ For the model, we use a custom ResNet-9 architecture from `KakaoBrain <https://g
class Mul(ch.nn.Module):
    def __init__(self, weight):
-       super(Mul, self).__init__()
-       self.weight = weight
+       super(Mul, self).__init__()
+       self.weight = weight
    def forward(self, x): return x * self.weight
class Flatten(ch.nn.Module):
2 changes: 2 additions & 0 deletions docs/index.rst
@@ -16,6 +16,8 @@ Install ``ffcv``:
conda activate ffcv
pip install ffcv
We also provide a `Dockerfile <https://github.com/libffcv/ffcv/blob/main/docker/Dockerfile>`_ that installs ``ffcv`` in a few steps.


Introduction
------------
11 changes: 9 additions & 2 deletions docs/making_dataloaders.rst
@@ -49,6 +49,13 @@ takes an ``enum`` provided by :class:`ffcv.loader.OrderOption`:
# Memory-efficient but not truly random loading
# Speeds up loading over RANDOM when the whole dataset does not fit in RAM!
ORDERING = OrderOption.QUASI_RANDOM
.. note::
   The ``order`` options require different amounts of RAM, so choose one based on how much RAM is available in your setup (a minimal loader sketch follows after this note).

   - ``RANDOM`` requires the most RAM, since it has to cache the entire dataset to sample perfectly at random. If the available RAM is not enough, it will throw an exception.
   - ``QUASI_RANDOM`` requires much less RAM than ``RANDOM``, but a bit more than ``SEQUENTIAL``, since it only caches part of the samples. Use it when the entire dataset cannot fit in RAM.
   - ``SEQUENTIAL`` requires the least RAM. It only keeps the few samples needed for upcoming training iterations loaded ahead of time.
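
A minimal sketch of how these orderings plug into a ``Loader``, using the ``ffcv.loader.Loader`` and ``OrderOption`` classes referenced above; the ``.beton`` path and batch settings are hypothetical placeholders:

    from ffcv.loader import Loader, OrderOption

    # Pick the ordering that matches the RAM budget described in the note above.
    loader = Loader('/path/to/dataset.beton',        # hypothetical path
                    batch_size=256,                  # hypothetical batch settings
                    num_workers=8,
                    order=OrderOption.QUASI_RANDOM)  # or RANDOM / SEQUENTIAL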

Pipelines
'''''''''
@@ -165,12 +172,12 @@ Other options

You can also specify the following additional options when constructing an :class:`ffcv.loader.Loader` (a combined sketch follows after this list):

-  - ``os_cache``: If True, the entire dataset is cached
+  - ``os_cache``: If ``True``, the OS automatically determines whether the dataset is held in memory or not, depending on available RAM. If ``False``, FFCV manages the caching, and the amount of RAM needed depends on the ``order`` option.
- ``distributed``: For training on :ref:`multiple GPUs<Scenario: Multi-GPU training (1 model, multiple GPUs)>`
- ``seed``: Specify the random seed for batch ordering
- ``indices``: Provide indices to load a subset of the dataset
- ``custom_fields``: For specifying decoders for fields with custom encoders
-  - ``drop_last``: If True, drops the last non-full batch from each iteration
+  - ``drop_last``: If ``True``, drops the last non-full batch from each iteration
- ``batches_ahead``: Set the number of batches prepared in advance. Increasing it absorbs variation in processing time so that the training loop does not stall while waiting for batches; decreasing it reduces RAM usage.
- ``recompile``: Recompile every iteration. Useful if you have transforms that change their behavior from epoch to epoch, for instance code that uses the shape as a compile-time parameter. (But if they just change their memory usage, e.g., the resolution changes, it's not necessary.)
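
Taken together, a sketch of a ``Loader`` that sets several of these options at once; the path, batch size, subset indices, and seed below are placeholder values, while the keyword names are the options listed above:

    from ffcv.loader import Loader, OrderOption

    loader = Loader('/path/to/dataset.beton',       # hypothetical path
                    batch_size=256,
                    num_workers=8,
                    order=OrderOption.RANDOM,
                    os_cache=True,                  # let the OS decide what stays in RAM
                    seed=42,                        # reproducible batch ordering
                    indices=list(range(10_000)),    # train on a 10k-sample subset
                    drop_last=True,                 # skip the final partial batch
                    batches_ahead=3)                # prefetch a few batches in advance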

2 changes: 1 addition & 1 deletion docs/parameter_tuning.rst
@@ -22,7 +22,7 @@ Scenario: Large scale datasets
If your dataset is too large to be cached on the machine, we recommend:

- Use ``os_cache=False``. Since the data can't be cached, FFCV will have to read it over and over. Having FFCV take over caching from the operating system is beneficial, as it knows in advance which samples will be needed in the future and can load them ahead of time.
-  - For ``order``, we recommend using the ``QUASI_RANDOM`` traversal order if you need randomness but perfect uniform sampling isn't mission critical. This will optimize the order to minimize the reads on the underlying storage while maintaining very good randomness properties. If you have experience with the ``shuffle()`` function of ``webdataset`` and the quality of the randomness wasn't sufficient, we still suggest you give ``QUASI_RANDOM`` a try as it should be significantly better.
+  - For ``order``, we recommend using the ``QUASI_RANDOM`` traversal order if you need randomness but perfect uniform sampling isn't mission critical. This will optimize the order to minimize the reads on the underlying storage while maintaining very good randomness properties. If you have experience with the ``shuffle()`` function of ``webdataset`` and the quality of the randomness wasn't sufficient, we still suggest you give ``QUASI_RANDOM`` a try as it should be significantly better. Using ``RANDOM`` is infeasible in this situation because it needs to load the entire dataset into RAM, causing an out-of-memory exception.


Scenario: Multi-GPU training (1 model, multiple GPUs)
8 changes: 4 additions & 4 deletions docs/quickstart.rst
@@ -18,15 +18,15 @@ PyTorch datasets and `WebDatasets <https://github.com/webdataset/webdataset>`_):
# Pass a type for each data field
writer = DatasetWriter(write_path, {
    # Tune options to optimize dataset size, throughput at train-time
-   'image': RGBImageField({
+   'image': RGBImageField(
        max_resolution=256,
        jpeg_quality=jpeg_quality
-   }),
+   ),
    'label': IntField()
})
# Write dataset
-writer.from_indexed_dataset(ds)
+writer.from_indexed_dataset(my_dataset)
Then replace your old loader with the `ffcv` loader at train time (in PyTorch,
no other changes required!):
@@ -58,4 +58,4 @@ no other changes required!):
for epoch in range(epochs):
    ...

-See :ref:`here <Getting started>` for a more detailed guide to deploying `ffcv` for your dataset.
+See :ref:`here <Getting started>` for a more detailed guide to deploying `ffcv` for your dataset.
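
For context, a minimal sketch of what that drop-in loader might look like at train time, assuming the ``'image'`` and ``'label'`` field names from the writer snippet above and the stock ffcv decoders/transforms (``SimpleRGBImageDecoder``, ``IntDecoder``, ``ToTensor``, ``ToTorchImage``, ``ToDevice``); ``write_path``, the batch settings, and ``epochs`` are placeholders:

    from ffcv.loader import Loader, OrderOption
    from ffcv.transforms import ToTensor, ToTorchImage, ToDevice
    from ffcv.fields.decoders import IntDecoder, SimpleRGBImageDecoder

    # One decode/transfer pipeline per field written above
    pipelines = {
        'image': [SimpleRGBImageDecoder(), ToTensor(), ToTorchImage(), ToDevice('cuda:0')],
        'label': [IntDecoder(), ToTensor(), ToDevice('cuda:0')],
    }

    loader = Loader(write_path, batch_size=512, num_workers=8,
                    order=OrderOption.RANDOM, pipelines=pipelines)

    for epoch in range(epochs):
        for images, labels in loader:
            ...  # training step unchanged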
2 changes: 1 addition & 1 deletion docs/writing_datasets.rst
@@ -40,7 +40,7 @@ returns an input vector and its corresponding label:
        self.Y = np.randn(N)

    def __getitem__(self, idx):
-       return (self.X[idx], self.Y[idx])
+       return (self.X[idx].astype('float32'), self.Y[idx])

    def __len__(self):
        return len(self.X)
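
To round out the fragment, a sketch of writing such a dataset with ``DatasetWriter``; the class name ``LinearRegressionDataset``, the field names, the sizes, and the output path are assumptions made for illustration, with the field types mirroring what ``__getitem__`` returns (a float32 vector and a scalar label):

    import numpy as np
    from ffcv.writer import DatasetWriter
    from ffcv.fields import NDArrayField, FloatField

    N, d = 10_000, 16                    # hypothetical dataset size
    ds = LinearRegressionDataset(N, d)   # assumed name of the dataset class above

    writer = DatasetWriter('/path/to/regression.beton', {   # hypothetical path
        'covariate': NDArrayField(shape=(d,), dtype=np.dtype('float32')),
        'label': FloatField(),
    })
    writer.from_indexed_dataset(ds)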
