Releases
v0.2.3
Enhancements
Added support for Red Hat OpenShift Container Platform (OCP) 4.1 and 4.2
Added additional installation methods
Added support for Go Modules and removed vendor directories
Added default ephemeral storage for init container
Overrode NVIDIA environment variables to avoid using GPUs on the launcher
Added health check and callbacks around various leader election phases
Honored the user-specified worker command
Exposed main container name as a configurable field
Added RunPolicy to MPIJobSpec that reuses kubeflow/common spec
Allowed specifying the gang scheduler name and the pod group priority
Added error log when pod spec does not have any containers
Switched to distroless images
Refactored kubectl-delivery to improve launcher performance
Added Prometheus metrics for job monitoring
Added experimental version of v1 MPIJob controller and APIs
Added support for Volcano as a scheduler
Switched to using pods for the launcher job and the statefulset workers
Switched to klog for logging
Made labels more consistent with other Kubeflow operators
Fixes
Fixed nil pointer exceptions that could accidentally restart the pod
Updated status to running only when launcher is active and all workers are ready
Fixed the incorrect namespace used when initializing informers and leader election endpoints
Fixed issue in v1 controller's CRD existence check
Documentation