Describe the problem

Currently an experiment must specify the `bind_mounts` option in order to reuse existing files (like datasets) on the host. However, every agent must also hold an identical copy of the files at the path given by `host_path` to guarantee the same behavior for each experiment. This becomes painful and problematic as the number of agents grows, even when an NFS service is deployed to keep the data consistent; for example, bind-mounting a subdirectory of an existing NFS share seems to raise a permission problem.
The root cause is that the `bind_mounts` option cannot specify a storage driver. Many modern cloud storage solutions, such as Hadoop's HDFS and the object storage services of AWS, GCP, Azure, and Alibaba Cloud, require a custom driver to be mounted as ordinary file storage inside the container. Docker volumes provide exactly this via the `--mount` option. For example, the NFS volume driver can be selected on the command line like so:

```
docker run --mount 'type=volume,src=<VOLUME-NAME>,dst=<CONTAINER-PATH>,volume-driver=local,volume-opt=type=nfs,volume-opt=device=<nfs-server>:<nfs-path>,"volume-opt=o=addr=<nfs-address>,vers=4,soft,timeo=180,bg,tcp,rw"' <image> <command>
```

With this in place, we would no longer need to mount the NFS share manually on each agent, download files in each experiment, or modify existing DataLoaders. Furthermore, an existing cloud storage service could be used like a normal file system, with Quality of Service managed by the storage driver instead.
Describe the solution you'd like
I haven't read the source code of this project, but I'm guessing the config is translated into raw `docker run` commands? If so, simply adding a translation for `--mount` should work.
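To illustrate, the translation could look roughly like this. This is a minimal sketch; the `spec` dict shape, its key names, and the `build_mount_arg` helper are all assumptions for illustration, not this project's actual config schema or API, and the server/path values are placeholders:

```python
# Hypothetical sketch: render a mount spec from an experiment config
# into a single value string suitable for `docker run --mount <value>`.
# The config shape and helper name are assumptions, not the real API.

def build_mount_arg(spec: dict) -> str:
    """Render a mount spec dict as a docker --mount value string."""
    parts = []
    for key, value in spec.items():
        if key == "volume_opts":
            # docker accepts repeated volume-opt entries, one per option
            for opt_key, opt_value in value.items():
                parts.append(f"volume-opt={opt_key}={opt_value}")
        else:
            parts.append(f"{key}={value}")
    return ",".join(parts)

# Example: an NFS-backed named volume (placeholder server and paths).
spec = {
    "type": "volume",
    "src": "my-data",
    "dst": "/data",
    "volume-driver": "local",
    "volume_opts": {
        "type": "nfs",
        "device": "nfs.example.com:/exports/data",
    },
}
print(build_mount_arg(spec))
```

The agent would then pass the resulting string through to `docker run --mount` unchanged, so any driver docker supports would work without further changes on the agent side.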
Describe alternatives you've considered
The biggest problem for me is how to easily use an existing cloud storage service, so a solution targeting a single common service such as NFS would also be acceptable.
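For the NFS-only alternative, the agent would just need enough information to fill in docker's local-driver NFS options. A sketch of what that expansion might look like (the function name and signature are hypothetical, and the server address and export path below are placeholders):

```python
# Hypothetical sketch: expand an NFS share description into the
# volume-opt entries used by docker's local volume driver, matching
# `--mount ...,volume-driver=local,volume-opt=type=nfs,...` above.

def nfs_volume_opts(server: str, export: str,
                    mount_opts: str = "vers=4,soft,timeo=180,bg,tcp,rw") -> list:
    """Build the volume-opt fragments for an NFS-backed docker volume."""
    return [
        "volume-opt=type=nfs",
        f"volume-opt=device={server}:{export}",
        # The `o=` value contains commas, so it must be quoted as a whole.
        f'"volume-opt=o=addr={server},{mount_opts}"',
    ]

# Placeholder values for illustration only.
opts = nfs_volume_opts("nfs.example.com", "/exports/datasets")
```

An experiment config would then only have to name the NFS server and export path, with the agent filling in sensible mount options by default.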
Additional context
Also add the option to mount `tmpfs`?