Storage tiers
The storage is organized across multiple tiers. The distinguishing characteristics for the tiers are:
speed (throughput and latency),
size,
accessibility (temporal and locational persistency), and
robustness (redundancy and back-ups).
Usually
speed is inversely proportional to size, robustness, and accessibility, and
size, robustness, and accessibility are proportional to each other.
Only low-speed storage (i.e. the Isilon NFS mount) will be accessible to all clusters in the future, so Isilon will become crucial for maintaining uniform data access across all clusters.
File systems accessible through the HPC Infiniband network
The HPC file systems are meant to store working data; they are not meant for long-term storage. The scratch file system and project directories store large temporary input/output files, and the home directory is meant for working storage. In addition, the local file systems accessible through /tmp (local persistent memory) and /dev/shm (virtual memory) are fast, available within jobs, and wiped when the job finishes. Finally, project storage is meant to store finalized input and output files.
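As an illustration for the docs, a job could stage its data into /dev/shm and copy only the finalized results back to project storage, keeping the small I/O on the fast tier. The script below is an untested sketch; the project path, solver binary, and Slurm options are placeholders.

```bash
#!/bin/bash -l
#SBATCH --job-name=stage-example
#SBATCH --time=01:00:00

# Placeholder project path: adjust to your project directory.
PROJECT=/work/projects/myproject
# RAM-backed scratch, wiped when the job finishes; its contents may count
# against the job's memory allocation.
WORKDIR=/dev/shm/${SLURM_JOB_ID}

mkdir -p "${WORKDIR}"
cp "${PROJECT}/input.dat" "${WORKDIR}/"   # stage the input once

cd "${WORKDIR}"
./my_solver input.dat > output.dat        # hypothetical solver; heavy small I/O stays on the fast tier

cp output.dat "${PROJECT}/"               # copy back only the finalized output
rm -rf "${WORKDIR}"
```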
However, there are file systems that are accessible through slower network connections and offer different kinds of features.
File systems not accessible through Infiniband
The central university storage is slower, but it is snapshotted and backed up much more regularly. Therefore, users should transfer their data to the central systems for long-term storage.
However, there are multiple options for accessing the central university storage: the systems Atlas, Ebenezer, Isilon-DMZi, and Isilon-DMZe.
What is the difference between Atlas, Ebenezer, and Isilon?
What is the difference between Isilon-DMZi and Isilon-DMZe?
How are user quota managed in central storage systems, and how can users see the usage limits?
The Isilon file system
Isilon is actually the name of the technical solution: https://www.dell.com/fr-fr/dt/storage/isilon/isilon-h5600-hybrid-nas-storage.htm#scroll=off
There are 2 central storage servers to Hyacinthe's knowledge, both operated by the SIU: "isilon-prod" and "isilon-drs" (an off-site replica of "isilon-prod", in case of disaster on "isilon-prod").
The isilon-prod is split into (at least) two zones:
the SIU zone, which is accessed using SMB via atlas.uni.lux, and
the HPC zone, which is mounted on the clusters with NFS and can be accessed at /mnt/isilon.
For the HPC side, we are only interested in the NFS-mounted file system. Documentation about Isilon: https://hpc-git.uni.lu/ulhpc/sysadmins/-/wikis/storage/isilon
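For the docs, a quick way to check the mount from a login node could look like the following (standard tools only, nothing Isilon-specific):

```bash
# Confirm that the Isilon share is mounted over NFS and where it comes from.
findmnt /mnt/isilon

# Overall usage of the mounted file system as reported by the server.
df -h /mnt/isilon
```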
The processes for the HPC zone are not well defined or documented. We can set up quota per project directory, but there's no way to show this information to the users. We are working on providing users with access to this information and setting up a policy for assigning quota.
We share the Isilon system with the SIU. There is a "fair use agreement" in place which allocated 2 PB to the HPC zone, currently at 88% of its full capacity. Maintaining access to the Isilon system is important moving forward, as the Isilon file system will be the only system unifying data access across our future clusters. We should participate in any future calls and coordinate with the SIU.
Performance is abysmal with small random I/O, for instance small files, metadata operations, etc. The Isilon NFS mount works well for administrative needs, like archiving and occasional data transfers, and even for big-file I/O. But do not try to perform any compute-driven operation on the NFS-mounted Isilon, like compiling software on it, or anything similar.
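As an example for the user docs (the paths are placeholders), packing many small files into one archive before moving them keeps the transfer in the big-file regime where the NFS mount behaves well:

```bash
# Placeholder paths: adjust the source and destination to your project.
SRC=/work/projects/myproject/run-042   # many small result files on fast storage
DST=/mnt/isilon/projects/myproject     # Isilon NFS mount, fine for large sequential I/O

# One large sequential write instead of thousands of small ones.
tar czf "${DST}/run-042.tar.gz" -C "${SRC}" .
```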
The Atlas file system
The SMB protocol allows for easy mounting of file systems on personal computers, including Windows machines.
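For the documentation, mounting an Atlas share on a personal Linux machine with the standard CIFS tools might look like the sketch below; the share name, mount point, and AD domain are assumptions and need to be checked against the actual Atlas exports.

```bash
# Assumed share name "users"; replace with the share actually exported by Atlas.
sudo mkdir -p /mnt/atlas
sudo mount -t cifs //atlas.uni.lux/users /mnt/atlas \
    -o username=<your_AD_login>,domain=UNI,uid=$(id -u),gid=$(id -g)

# On Windows the equivalent is mapping a network drive, e.g.:
#   net use Z: \\atlas.uni.lux\users
```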
The HPC team does not manage the file system exported through SMB from Atlas (atlas.uni.lux). However, the HPC team maintains the smb-storage script (under active development), which allows mounting SMB shares on the login nodes of our clusters.
Fun fact: you can access the HPC zone via Samba on your workstation using your Active Directory credentials. This works via a fragile script that maps Windows/POSIX permissions and user accounts from the HPC-IPA to the SIU Active Directory. It was requested by the LCSB Bio-core in 2014. The system still works, but it is no longer supported. Honestly, if you are using Linux you can get the performance of SMB with SSHFS: https://blog.ja-ke.tech/2019/08/27/nas-performance-sshfs-nfs-smb.html
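A minimal SSHFS sketch for a personal Linux machine (the login node host and remote path are placeholders):

```bash
# Mount a directory from a cluster login node onto the local machine.
mkdir -p ~/isilon
sshfs <login_node>:/mnt/isilon/projects/myproject ~/isilon

# ... work on the files as if they were local ...

# Unmount when done.
fusermount -u ~/isilon
```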
Add some instructions on how to fix errors in access permissions
The discussion of data management is a bit disorganized. We should probably reorganize the sections and add some information on how users can fix their projects when errors occur.
To fix access permissions in a project directory, the group ownership typically has to be reset to the project group and the setgid bit restored on the directories; a sketch is given below. Also, add a link with more resources: https://www.redhat.com/sysadmin/suid-sgid-sticky-bit
In the future, make sure that the files and directories you create in projects have the correct permissions. Remember, in project directories the quotas are computed per project group (covalux in your case). Cluster users (clusterusers) have 0 quota in the project directory, so any complaint about insufficient storage may also be caused by incorrect user groups.
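A possible sketch for the docs (the project path is an assumption, and covalux is just the example group from this thread; double-check before running recursive commands on shared data):

```bash
# Placeholder project path: adjust to the actual project directory.
PROJECT=/work/projects/covalux

# Re-assign files to the project group so they count against the project quota
# instead of the clusterusers group, which has 0 quota.
chgrp -R covalux "${PROJECT}"

# Set the setgid bit on directories so that new files inherit the project group.
find "${PROJECT}" -type d -exec chmod g+s {} +

# Give the group read/write access (adjust to the project's policy).
chmod -R g+rwX "${PROJECT}"
```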