feat(kfp): add train stage #20

Merged 6 commits on Sep 13, 2024

Commits on Sep 12, 2024

  1. feat(kfp): add train stage

    Signed-off-by: Tomas Coufal <[email protected]>
    tumido committed Sep 12, 2024
    67bbfae
  2. fix: add -p to mkdir

    Signed-off-by: Tomas Coufal <[email protected]>
    tumido committed Sep 12, 2024
    c5dcacc
  3. fix: add spec.nprocPerNode to PyTorchJob

    Signed-off-by: Tomas Coufal <[email protected]>
    tumido committed Sep 12, 2024
    e3a7e6a
  4. fix: env variables for the training container to make it variables template properly
    
    Signed-off-by: Tomas Coufal <[email protected]>
    tumido committed Sep 12, 2024
    c579f13

Commits on Sep 13, 2024

  1. fix: enforce single node PyTorchJob for now

    Signed-off-by: Tomas Coufal <[email protected]>
    tumido committed Sep 13, 2024
    e7e730b
  2. fix: remove memory limit on PyTorchJob for testing purposes

    Signed-off-by: Tomas Coufal <[email protected]>
    tumido committed Sep 13, 2024
    38cd301