AWS ParallelCluster v3.7.0
dreambeyondorange released this 30 Aug 12:11
We're excited to announce the release of AWS ParallelCluster 3.7.0
Upgrade
How to upgrade?
sudo pip install --upgrade aws-parallelcluster
ENHANCEMENTS
- Add support for Ubuntu 22. RSA keys are not supported by default. See this page.
- Add support for login nodes.
- Add support to mount existing Amazon File Cache as shared storage.
- Allow configuration of static and dynamic node priorities in Slurm compute resources via the ParallelCluster configuration YAML file.
- Add a queue-level parameter (JobExclusiveAllocation) to ensure nodes in the partition are exclusively allocated to a single job at any given time.
- Allow overriding the aws-parallelcluster-node package at cluster creation and update time (only on the head node during update). Useful for development purposes only.
- Allow memory-based scheduling when multiple instance types are specified for a Slurm Compute Resource.
- Avoid starting the NFS server on compute nodes.
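
As a rough illustration of where the new queue-level and compute-resource-level settings sit in the cluster configuration, the sketch below builds the relevant fragment as a Python dictionary and prints it as JSON purely to show the nesting (the real configuration file is YAML). JobExclusiveAllocation comes from the notes above; the key names StaticNodePriority and DynamicNodePriority, the queue and compute resource names, and the counts are assumptions for illustration and should be checked against the configuration reference.

```python
import json

# Hypothetical fragment of a cluster configuration, expressed as a dict.
# Only JobExclusiveAllocation is named in the release notes; the priority
# key names and all values below are illustrative assumptions.
config_fragment = {
    "Scheduling": {
        "Scheduler": "slurm",
        "SlurmQueues": [
            {
                "Name": "queue1",  # hypothetical queue name
                # Queue-level: each node runs at most one job at a time.
                "JobExclusiveAllocation": True,
                "ComputeResources": [
                    {
                        "Name": "cr1",  # hypothetical compute resource name
                        "MinCount": 1,
                        "MaxCount": 10,
                        # Assumed key names for the node priority settings.
                        "StaticNodePriority": 1,
                        "DynamicNodePriority": 1000,
                    }
                ],
            }
        ],
    }
}

# Print the structure so the nesting of queue-level versus
# compute-resource-level keys is easy to see.
print(json.dumps(config_fragment, indent=2))
```

The point of the sketch is only the placement: the exclusive-allocation flag belongs to the queue, while the priorities belong to each compute resource.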
CHANGES
- Deprecate Ubuntu 18.
- Upgrade Slurm to version 23.02.4.
- Update the default root volume size to 40 GB to account for limits on CentOS 7.
- Upgrade NVIDIA driver to version 535.54.03.
- Upgrade CUDA library to version 12.2.0.
- Upgrade NVIDIA Fabric manager to nvidia-fabricmanager-535.
- Upgrade NICE DCV to version 2023.0-15487.
  - server: 2023.0.15487-1
  - xdcv: 2023.0.551-1
  - gl: 2023.0.1039-1
  - web_viewer: 2023.0.15487-1
- Upgrade EFA installer to 1.25.1.
  - Efa-driver: efa-2.5.0-1
  - Efa-config: efa-config-1.15-1
  - Efa-profile: efa-profile-1.5-1
  - Libfabric-aws: libfabric-aws-1.18.1-1
  - Rdma-core: rdma-core-46.0-1
  - Open MPI: openmpi40-aws-4.1.5-4
- Upgrade ARM PL to version 23.04.1 for Ubuntu 22.04 only.
- Assign Slurm dynamic nodes a priority (weight) of 1000 by default. This allows Slurm to prioritize idle static nodes over idle dynamic ones.
- Change the default value of Imds/ImdsSupport from v1.0 to v2.0.
- Make aws-parallelcluster-node daemons handle only ParallelCluster-managed Slurm partitions.
- Create a Slurm partition-nodelist mapping JSON file to be used by the node package daemons to recognize PC-managed Slurm partitions and nodelists.
- Increase EFS-utils watchdog poll interval to 10 seconds. Note: This change is meaningful only if EncryptionInTransit is set to true, because watchdog does not run otherwise.
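
To see why a default weight of 1000 on dynamic nodes makes idle static nodes the preferred choice, here is a minimal sketch of lowest-weight-first selection. It is not Slurm's scheduler, only a toy model of the ordering rule. The static weight of 1, the node names, and the `pick_nodes` helper are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    weight: int  # lower weight is preferred
    idle: bool


def pick_nodes(nodes, count):
    """Return the names of up to `count` idle nodes, lowest weight first.

    sorted() is stable, so nodes with equal weight keep their input order.
    """
    idle_nodes = [n for n in nodes if n.idle]
    return [n.name for n in sorted(idle_nodes, key=lambda n: n.weight)[:count]]


# Hypothetical node list: static nodes assumed to have weight 1,
# dynamic nodes the default weight of 1000 described above.
nodes = [
    Node("queue1-dy-cr1-1", 1000, True),
    Node("queue1-st-cr1-1", 1, True),
    Node("queue1-dy-cr1-2", 1000, True),
    Node("queue1-st-cr1-2", 1, True),
    Node("queue1-st-cr1-3", 1, False),  # busy, so never selected
]

print(pick_nodes(nodes, 3))
```

In this model a three-node request takes both idle static nodes before touching any dynamic node, which is the behavior the new default is meant to encourage.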
BUG FIXES
- Add validation to the ScaledownIdletime value to prevent setting a value lower than -1.
- Fix issue causing dangling IAM policies to be created when creating ParallelCluster CloudFormation custom resource provider with CustomLambdaRole.
- Fix an issue that was causing misalignment of compute nodes DNS name on instances with multiple network interfaces when SlurmSettings/Dns/UseEc2Hostnames is set to True.
- Fix cluster creation failure with Ubuntu Deep Learning AMI on GPU instances and DCV enabled.
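
The ScaledownIdletime fix amounts to a lower-bound check. A minimal sketch of such a check follows, assuming the value is an integer; the function name `validate_scaledown_idletime` and the error messages are hypothetical and do not reflect the actual ParallelCluster validator.

```python
def validate_scaledown_idletime(value):
    """Return `value` unchanged if it is an integer >= -1, else raise.

    Hypothetical helper: mirrors the rule stated in the release notes
    (values lower than -1 are rejected), not the real implementation.
    """
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        raise TypeError("ScaledownIdletime must be an integer")
    if value < -1:
        raise ValueError("ScaledownIdletime must not be lower than -1")
    return value


for candidate in (10, -1, -2):
    try:
        print(candidate, "accepted:", validate_scaledown_idletime(candidate))
    except ValueError as err:
        print(candidate, "rejected:", err)
```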