Releases: aws/aws-parallelcluster

AWS ParallelCluster v3.1.2

02 Mar 14:40

We're excited to announce the release of AWS ParallelCluster 3.1.2

Upgrade

How to upgrade?

sudo pip install --upgrade aws-parallelcluster

CHANGES

  • Upgrade Slurm to version 21.08.6.

BUG FIXES

  • Fix the update of the /etc/hosts file on compute nodes when a cluster is deployed in subnets without internet access.
  • Fix compute node bootstrap by waiting for ephemeral drive initialization before joining the cluster.

AWS ParallelCluster v3.1.1

10 Feb 19:01

We're excited to announce the release of AWS ParallelCluster 3.1.1

Upgrade

How to upgrade?

sudo pip install --upgrade aws-parallelcluster

ENHANCEMENTS

  • Add support for multi-user cluster environments by integrating with Active Directory (AD) domains managed via AWS Directory Service.
  • Enable cluster creation in subnets with no internet access.
  • Add abbreviated flags for cluster-name (-n), region (-r), image-id (-i) and cluster-configuration / image-configuration (-c) to the CLI.
  • Add support for multiple compute resources with same instance type per queue.
  • Add support for UseEc2Hostnames in the cluster configuration file. When set to true, use EC2 default hostnames (e.g. ip-1-2-3-4) for compute nodes.
  • Add support for GPU scheduling with Slurm on ARM instances with NVIDIA cards. Install NVIDIA drivers and CUDA library for ARM.
  • Add parallelcluster:compute-resource-name tag to LaunchTemplates used by compute nodes.
  • Add support for NEW_CHANGED_DELETED as value of FSx for Lustre AutoImportPolicy option.
  • Explicitly set cloud-init datasource to be EC2. This saves boot time on Ubuntu and CentOS platforms.
  • Improve Security Groups created within the cluster to allow inbound connections from custom security groups when the SecurityGroups parameter is specified for the head node and/or queues.
  • Build Slurm with slurmrestd support.
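
As an illustration of the UseEc2Hostnames entry above, the following minimal Python sketch derives the EC2-style default hostname (the ip-1-2-3-4 form quoted in the note) from a private IPv4 address. The helper name ec2_default_hostname is hypothetical and is not part of ParallelCluster; it only shows the naming pattern.

```python
import ipaddress


def ec2_default_hostname(private_ip: str) -> str:
    """Return the EC2-style default hostname for a private IPv4 address.

    Hypothetical helper for illustration only; it mirrors the
    "ip-1-2-3-4" pattern mentioned in the release note.
    """
    # IPv4Address validates the input and raises ValueError if it is malformed.
    addr = ipaddress.IPv4Address(private_ip)
    return "ip-" + str(addr).replace(".", "-")


print(ec2_default_hostname("1.2.3.4"))  # ip-1-2-3-4
```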

CHANGES

  • Upgrade Slurm to version 21.08.5.
  • Upgrade NICE DCV to version 2021.3-11591.
  • Upgrade NVIDIA driver to version 470.103.01.
  • Upgrade CUDA library to version 11.4.4.
  • Upgrade NVIDIA Fabric manager to version 470.103.01.
  • Upgrade Intel MPI Library to 2021.4.0.441.
  • Upgrade PMIx to version 3.2.3.
  • Disable package update at instance launch time on Amazon Linux 2.
  • Allow suppressing the SlurmQueues and ComputeResources length validators.
  • Use compute resource name rather than instance type in compute fleet Launch Template name.
  • Disable EC2 ImageBuilder enhanced image metadata when building ParallelCluster custom images.
  • Remove dumping of failed compute nodes to /home/logs/compute. Compute node log files are available in CloudWatch
    and in EC2 console logs.

BUG FIXES

  • Redirect stderr and stdout to the CLI log file to prevent unwanted text from polluting the pcluster CLI output.
  • Fix exporting of cluster logs when no prefix is specified. Previously, logs were exported to a None prefix.
  • Fix rollback not being performed in case of cluster update failure.
  • Do not configure GPUs in Slurm when NVIDIA driver is not installed.
  • Fix ecs:ListContainerInstances permission in BatchUserRole.
  • Fix RootVolume schema for the HeadNode by raising an error if unsupported KmsKeyId is specified.
  • Fix EfaSecurityGroupValidator. Previously, it could produce false failures when custom security groups were provided and EFA was enabled.
  • Fix FSx metrics not displayed in the CloudWatch dashboard.

AWS ParallelCluster v3.0.3

17 Jan 13:49

We're excited to announce the release of AWS ParallelCluster 3.0.3

Upgrade

How to upgrade?

sudo pip install --upgrade aws-parallelcluster

CHANGES

  • Disable log4j-cve-2021-44228-hotpatch service on Amazon Linux to avoid potential performance degradation.

AWS ParallelCluster v2.11.4

20 Dec 17:02

We're excited to announce the release of AWS ParallelCluster 2.11.4

Upgrade

How to upgrade?

sudo pip install aws-parallelcluster==2.11.4

CHANGES

  • CentOS 8 is no longer supported (EOL on December 31st, 2021).
  • Upgrade Slurm to version 20.11.8.
  • Upgrade Cinc Client to version 17.2.29.
  • Upgrade NICE DCV to version 2021.2-11190.
  • Upgrade NVIDIA driver to version 470.82.01.
  • Upgrade CUDA library to version 11.4.3.
  • Upgrade NVIDIA Fabric manager to 470.82.01.
  • Disable package updates at instance launch time on Amazon Linux 2.
  • Disable unattended package updates on Ubuntu.
  • Install Python 3 version of aws-cfn-bootstrap scripts on CentOS 7 and Ubuntu 18.04, aligning with Ubuntu 20.04 and Amazon Linux 2.

BUG FIXES

  • Disable update of ec2_iam_role parameter.
  • Fix CpuOptions configuration in LaunchTemplate for t2 instances.

AWS ParallelCluster v3.0.2

05 Nov 18:24

We're excited to announce the release of AWS ParallelCluster 3.0.2

Upgrade

How to upgrade?

sudo pip install --upgrade aws-parallelcluster

3.0.2

CHANGES

  • Upgrade EFA installer to version 1.14.1. As of this version, EFA enables GDR support by default on supported instance types.
    ParallelCluster does not reinstall EFA during node start. Previously, EFA was reinstalled if GdrSupport had been
    turned on in the configuration file. The GdrSupport parameter has no effect and should no longer be used.
    • EFA configuration: efa-config-1.9-1
    • EFA profile: efa-profile-1.5-1
    • EFA kernel module: efa-1.14.2
    • RDMA core: rdma-core-37.0
    • Libfabric: libfabric-1.13.2
    • Open MPI: openmpi40-aws-4.1.1-2

BUG FIXES

  • Fix issue that prevented cluster names from starting with the parallelcluster- prefix.

AWS ParallelCluster v2.11.3

03 Nov 17:56

We're excited to announce the release of AWS ParallelCluster 2.11.3

Upgrade

How to upgrade?

sudo pip3 install "aws-parallelcluster<3.0" --upgrade --user
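
The quoted constraint "aws-parallelcluster<3.0" in the command above keeps pip on the 2.x release line even though 3.x releases exist. The following Python sketch shows the effect of that constraint on a few version strings from this page. It is a simplified illustration that handles only plain dotted numeric versions, not pip's full version-specifier rules, and the function names are hypothetical.

```python
def parse_version(version: str) -> tuple:
    """Split a plain dotted version such as "2.11.3" into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def is_below_3_0(version: str) -> bool:
    """Return True if the version would satisfy a "<3.0" constraint.

    Simplified: pre-release and post-release suffixes are not handled.
    """
    return parse_version(version) < (3, 0)


for candidate in ["2.11.3", "2.11.4", "3.0.0", "3.1.2"]:
    print(candidate, is_below_3_0(candidate))
```

Under this simplified comparison, the 2.11.x versions pass the check and the 3.x versions do not, which is why the command never upgrades past the 2.x line.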

2.11.3

CHANGES

  • Upgrade EFA installer to version 1.14.1. As of this version, EFA enables GDR support by default on supported instance types.
    ParallelCluster does not reinstall EFA during node start. Previously, EFA was reinstalled if enable_efa_gdr had been
    turned on in the configuration file.
    • EFA configuration: efa-config-1.9-1
    • EFA profile: efa-profile-1.5-1
    • EFA kernel module: efa-1.14.2
    • RDMA core: rdma-core-37.0
    • Libfabric: libfabric-1.13.2
    • Open MPI: openmpi40-aws-4.1.1-2
  • Include tags from cluster configuration file in the RunInstances dry runs performed during configuration validation.

BUG FIXES

  • Fix issues with the custom AMI creation functionality:
    • The SGE download URL is no longer reachable. Use the Debian repository to download the SGE source archive.
    • Outdated CA certificates were used by Cinc. Update the ca-certificates package during AMI build time.
  • Fix cluster update when using a proxy setup.

AWS ParallelCluster v3.0.1

27 Oct 14:23

We're excited to announce the release of AWS ParallelCluster 3.0.1

Upgrade

How to upgrade?

sudo pip install --upgrade aws-parallelcluster

3.0.1

ENHANCEMENTS

  • Add pcluster3-config-converter CLI command to convert a cluster configuration from the ParallelCluster 2 format to the ParallelCluster 3 format.
  • The region parameter is now retrieved from the provider chain, thus supporting the use of profiles and defaults specified in the ~/.aws/config file.
  • Export ParallelClusterApiInvokeUrl and ParallelClusterApiUserRole in CloudFormation API Stack so they can be used by cross-stack references.

CHANGES

  • Drop support for SysVinit. Only Systemd is supported.
  • Include tags from cluster configuration file in the RunInstances dry runs performed during configuration validation.
  • Allow '*' character in the configuration of S3Access/BucketName.

BUG FIXES

  • Pin the transitive dependencies resulting from the dependency on connexion.
  • Fix cleanup of ECR resources when API infrastructure template is deleted.
  • Fix supervisord service not enabled on Ubuntu. This was causing supervisord not to be started on instance reboot.
  • Update ca-certificates package during AMI build time and have Cinc use the updated CA certificates bundle.
  • Close stderr before exiting from pcluster CLI commands to avoid BrokenPipeError for processes that close the other end of the stdout pipe.

AWS ParallelCluster v3.0.0

10 Sep 15:51

We're excited to announce the release of AWS ParallelCluster 3.0.0

Upgrade

How to upgrade?

sudo pip install --upgrade aws-parallelcluster

3.0.0

ENHANCEMENTS

  • Add support for pcluster actions (e.g., create-cluster, update-cluster, delete-cluster) through HTTP endpoints
    with Amazon API Gateway.
  • Revamp custom AMI creation and management by leveraging EC2 Image Builder. This also includes the implementation of
    build-image, delete-image, describe-image and list-image commands to manage custom ParallelCluster images.
  • Add list-official-images command to describe ParallelCluster official AMIs.
  • Add export-cluster-logs, list-cluster-logs and get-cluster-log-events commands to retrieve both CloudWatch Logs
    and CloudFormation Stack Events. Add export-image-logs, list-image-logs and get-image-log-events commands to
    retrieve both Image Builder Logs and CloudFormation Stack Events.
  • Allow restarting / rebooting the head node also for instance types with instance store.
    These operations remain under the control of the user, who is responsible for the status of the cluster while
    operating on the head node, e.g. stopping the compute fleet first.
  • Add support to use an existing Private Route53 Hosted Zone when using Slurm as scheduler.
  • Add the possibility to configure the instance profile as an alternative to configuring the IAM role for the head node
    and for each compute queue.
  • Add the possibility to configure IAM role, profile and policies for head node and for each compute queue.
  • Add possibility to configure different security groups for each queue.
  • Allow full control on the name of CloudFormation stacks created by ParallelCluster by removing the parallelcluster-
    prefix.
  • Add multiple queues and compute resources support for pcluster configure when the scheduler is Slurm.
  • Add prompt for availability zone in pcluster configure automated subnets creation.
  • Add configuration HeadNode / Imds / Secured to enable/disable restricted access to Instance Metadata Service (IMDS).
  • Implement scaling protection mechanism with Slurm scheduler: compute fleet is automatically set to 'PROTECTED'
    state in case recurrent failures are encountered when provisioning nodes.
  • Add --suppress-validators and --validation-failure-level parameters to create and update commands.
  • Add support for associating an existing Elastic IP to the head node.
  • Extend limits for supported number of Slurm queues (10) and compute resources (5).
  • Encrypt root EBS volumes and shared EBS volumes by default. Note that if the scheduler is AWS Batch, the root volumes
    of the compute nodes cannot be encrypted by ParallelCluster.
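
The queue and compute resource limits listed above can be checked before a configuration is submitted. The following is a minimal Python sketch, not ParallelCluster's own validator: the function name check_queue_limits and the input shape (a mapping from queue name to a list of compute resource names) are hypothetical, and reading the limit of 5 compute resources as a per-queue limit is an assumption.

```python
from typing import Dict, List

MAX_SLURM_QUEUES = 10  # limit stated in the 3.0.0 release notes
MAX_COMPUTE_RESOURCES_PER_QUEUE = 5  # assumption: the limit of 5 applies per queue


def check_queue_limits(queues: Dict[str, List[str]]) -> List[str]:
    """Return human-readable errors for a {queue name: [compute resource names]} mapping."""
    errors = []
    if len(queues) > MAX_SLURM_QUEUES:
        errors.append(
            f"{len(queues)} queues defined, at most {MAX_SLURM_QUEUES} are supported"
        )
    for name, resources in queues.items():
        if len(resources) > MAX_COMPUTE_RESOURCES_PER_QUEUE:
            errors.append(
                f"queue {name!r} has {len(resources)} compute resources, "
                f"at most {MAX_COMPUTE_RESOURCES_PER_QUEUE} are supported"
            )
    return errors


# Hypothetical layout: two queues, each within the limits, so no errors are reported.
print(check_queue_limits({"ondemand": ["c5large", "c5xlarge"], "spot": ["c5large"]}))
```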

CHANGES

  • Upgrade EFA installer to version 1.13.0
    • EFA configuration: efa-config-1.9
    • EFA profile: efa-profile-1.5
    • EFA kernel module: efa-1.13.0
    • RDMA core: rdma-core-35
    • Libfabric: libfabric-1.13.0
    • Open MPI: openmpi40-aws-4.1.1-2
  • Upgrade NICE DCV to version 2021.1-10851.
  • Upgrade Slurm to version 20.11.8.
  • Upgrade NVIDIA driver to version 470.57.02.
  • Upgrade CUDA library to version 11.4.0.
  • Upgrade Cinc Client to version 17.2.29.
  • Upgrade Python runtime used by Lambda functions in AWS Batch integration to python3.8.
  • Remove support for SGE and Torque schedulers.
  • Remove support for CentOS 8.
  • Change format and syntax of the configuration file to be used to create the cluster, from ini to YAML. A cluster configuration
    file now only includes the definition of a single cluster.
  • Remove --cluster-template, --extra-parameters and --tags parameters for the create command.
  • Remove --cluster-template, --extra-parameters, --reset-desired and --yes parameters for the update command.
  • Remove --config parameter for delete, status, start, stop, instances and list commands.
  • Remove possibility to specify aliases for ssh command in the configuration file.
  • Distribute AWS Batch commands: awsbhosts, awsbkill, awsbout, awsbqueues, awsbstat and awsbsub as a
    separate aws-parallelcluster-awsbatch-cli PyPI package.
  • Add timestamp suffix to CloudWatch Log Group name created for the cluster.
  • Remove pcluster-config CLI utility.
  • Remove amis.txt file.
  • Remove additional EBS volume attached to the head node by default.
  • Change NICE DCV session storage path to /home/{UserName}.
  • Create a single ParallelCluster S3 bucket for each AWS region rather than for each cluster.
  • Adopt inclusive language
    • Rename MasterServer to HeadNode in CLI outputs.
    • Rename variable exported in the AWS Batch job environment from MASTER_IP to PCLUSTER_HEAD_NODE_IP.
    • Rename all CFN outputs from Master* to HeadNode*.
    • Rename NodeType and tags from Master to HeadNode.
  • Rename tags (Note: the following tags are crucial for ParallelCluster scaling logic):
    • aws-parallelcluster-node-type -> parallelcluster:node-type
    • ClusterName -> parallelcluster:cluster-name
    • aws-parallelcluster-attributes -> parallelcluster:attributes
    • Version -> parallelcluster:version
  • Remove tag: Application.
  • Remove runtime creation method of custom ParallelCluster AMIs.
  • Retain CloudWatch logs on cluster deletion by default. If you want to delete the logs during cluster deletion, set
    Monitoring / Logs / CloudWatch / RetainOnDeletion to False in the configuration file.
  • Remove instance store software encryption option (encrypted_ephemeral) and rely on default hardware encryption provided
    by NVMe instance store volumes.
  • Add tag 'Name' to every shared storage with the value specified in the shared storage name config.
  • Remove installation of MPICH and FFTW packages.
  • Remove Ganglia support.
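
User scripts that filter or report on cluster resources by tag may need to follow the tag renames listed above. The following is a minimal Python sketch of that mapping, using only the tag keys named in these notes. The function name migrate_tag_keys is hypothetical, the sketch translates keys only (tag values, such as the Master to HeadNode rename, are left untouched), and it is not how ParallelCluster itself applies tags.

```python
from typing import Dict

# Old tag key -> new tag key, as listed in the 3.0.0 release notes.
TAG_RENAMES = {
    "aws-parallelcluster-node-type": "parallelcluster:node-type",
    "ClusterName": "parallelcluster:cluster-name",
    "aws-parallelcluster-attributes": "parallelcluster:attributes",
    "Version": "parallelcluster:version",
}

# Tag keys removed in 3.0.0.
REMOVED_TAGS = {"Application"}


def migrate_tag_keys(tags: Dict[str, str]) -> Dict[str, str]:
    """Translate ParallelCluster 2 tag keys to their ParallelCluster 3 names.

    Removed tags are dropped; keys that are not listed are kept unchanged.
    """
    migrated = {}
    for key, value in tags.items():
        if key in REMOVED_TAGS:
            continue
        migrated[TAG_RENAMES.get(key, key)] = value
    return migrated


# Example with hypothetical tag values.
print(migrate_tag_keys({"ClusterName": "demo", "Application": "demo-app", "Owner": "hpc-team"}))
```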

AWS ParallelCluster v2.11.2

26 Aug 17:02

We're excited to announce the release of AWS ParallelCluster 2.11.2

Upgrade

How to upgrade?

sudo pip install --upgrade aws-parallelcluster

2.11.2

CHANGES

  • When using a custom AMI with a preinstalled EFA package, no actions are taken at node bootstrap time in case GPUDirect RDMA is enabled. The original EFA package deployment is preserved as installed during the createami process.
  • Upgrade EFA installer to version 1.13.0
    • Update rdma-core to v35.0.
    • Update libfabric to v1.13.0amzn1.0.

BUG FIXES

  • Lock the version of nvidia-fabricmanager package to the installed NVIDIA drivers to prevent updates and misalignments.
  • Slurm: fix issue that prevented powering-up nodes from being correctly reset after a stop and start of the cluster.

AWS ParallelCluster v2.11.1

23 Jul 23:52

We're excited to announce the release of AWS ParallelCluster 2.11.1

Upgrade

How to upgrade?

sudo pip install --upgrade aws-parallelcluster

CHANGES

  • Restore the noatime option, which has a positive impact on the performance of the NFS filesystem.
  • Upgrade EFA installer to version 1.12.3
    • EFA configuration: efa-config-1.9 (from efa-config-1.8-1)
    • EFA kernel module: efa-1.13.0 (from efa-1.12.3)

BUG FIXES

  • Pin to version 1.247347 of the CloudWatch agent due to performance impact of latest CW agent version 1.247348.
  • Avoid failures when building SGE using an instance type with 32 or more vCPUs.