ISC Demos

Welcome to the Strong Compute Instant Super Computer (ISC) Demos repo. Before diving into these demos, we recommend that Strong Compute users complete the Getting Started section of the Developer Docs.

Demos

The following examples demonstrate how to use the ISC to train a variety of models, including how to make distributed training scripts interruptible using checkpointing, atomic saving, and stateful samplers.
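As a rough illustration of what "atomic saving" can mean, the sketch below writes a checkpoint to a temporary file and then renames it over the target, so an interruption mid-write should never leave a truncated checkpoint behind. It uses only the Python standard library (`pickle`) to stay self-contained; a real training script would more likely serialise with `torch.save`. The names `atomic_save` and `load_checkpoint` are hypothetical and are not taken from the demos themselves.

```python
import os
import pickle
import tempfile


def atomic_save(state: dict, path: str) -> None:
    """Write `state` to `path` without ever exposing a partially written file."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file lives in the target directory so that the final
    # rename stays on one filesystem, which is what makes it atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as f:
            pickle.dump(state, f)
            f.flush()
            os.fsync(f.fileno())  # push the bytes to disk before renaming
        # os.replace swaps the new file in; readers see old or new, never half.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temporary file if anything went wrong before the rename.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def load_checkpoint(path: str, default: dict) -> dict:
    """Return the saved state, or `default` when no checkpoint exists yet."""
    if not os.path.exists(path):
        return default
    with open(path, "rb") as f:
        return pickle.load(f)
```

A training loop would typically call `load_checkpoint` once at start-up and `atomic_save` at regular intervals, so that a job interrupted at any point can resume from the last complete checkpoint.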

These examples are under active development toward three milestones: [1] interruptibility in distributed training, [2] verified completion of a full training run, and [3] benchmark performance matching that published by others (where applicable). Each example published below is annotated with its degree of completion. Examples annotated with [0] are "coming soon".
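To make milestone [1] more concrete, here is a minimal sketch of a stateful sampler: it remembers how far through the current epoch it has progressed, so a resumed job can continue from the next unseen sample instead of restarting the epoch. The class `ResumableSampler` and its methods are hypothetical names for illustration, and the sketch deliberately omits the sharding of indices across ranks that a distributed sampler would also need.

```python
import random
from typing import Iterator, List


class ResumableSampler:
    """Yields shuffled dataset indices and can resume part-way through an epoch."""

    def __init__(self, dataset_len: int, seed: int = 0) -> None:
        self.dataset_len = dataset_len
        self.seed = seed
        self.epoch = 0
        self.progress = 0  # number of indices already yielded this epoch

    def _order(self) -> List[int]:
        # The shuffle depends only on seed and epoch, so a resumed job
        # reconstructs exactly the same ordering as the interrupted one.
        rng = random.Random(self.seed + self.epoch)
        order = list(range(self.dataset_len))
        rng.shuffle(order)
        return order

    def __iter__(self) -> Iterator[int]:
        order = self._order()
        while self.progress < self.dataset_len:
            index = order[self.progress]
            self.progress += 1
            yield index
        # Epoch finished: advance to the next epoch and reset progress.
        self.epoch += 1
        self.progress = 0

    def state_dict(self) -> dict:
        return {"epoch": self.epoch, "progress": self.progress}

    def load_state_dict(self, state: dict) -> None:
        self.epoch = state["epoch"]
        self.progress = state["progress"]
```

Saving `state_dict()` alongside the model and optimizer state in each checkpoint, and calling `load_state_dict()` on restart, is one way to avoid both repeating and skipping samples after an interruption.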

Hello World

| Title | Description | Model | Status | Link |
| --- | --- | --- | --- | --- |
| Fashion MNIST | Image classification | CNN | [3] | isc-demos/fashion_mnist |
| CIFAR100 | Image classification | ResNet50 | [2] | isc-demos/cifar100-resnet50 |
| Distributed Model Parallel | TBC | TBC | [0] | |

pytorch-image-models (timm)

(from https://github.com/huggingface/pytorch-image-models)

| Title | Description | Model | Status | Link |
| --- | --- | --- | --- | --- |
| resnet50 | Image classification | ResNet50 | [3] | isc-demos/pytorch-image-models |
| resnet152 | Image classification | ResNet152 | [2] | isc-demos/pytorch-image-models |
| efficientnet_b0 | Image classification | EfficientNet B0 | [2] | isc-demos/pytorch-image-models |
| efficientnet_b7 | Image classification | EfficientNet B7 | [2] | isc-demos/pytorch-image-models |
| efficientnetv2_s | Image classification | EfficientNetV2 S | [2] | isc-demos/pytorch-image-models |
| efficientnetv2_xl | Image classification | EfficientNetV2 XL | [2] | isc-demos/pytorch-image-models |
| vit_base_patch16_224 | Image classification | ViT Base Patch16 224 | [2] | isc-demos/pytorch-image-models |
| vit_large_patch16_224 | Image classification | ViT Large Patch16 224 | [2] | isc-demos/pytorch-image-models |

Torchvision segmentation

(from https://github.com/pytorch/vision/tree/main/references/segmentation)

| Title | Description | Model | Status | Link |
| --- | --- | --- | --- | --- |
| fcn_resnet101 | Image segmentation | ResNet101 | [2] | isc-demos/tv-segmentation |
| deeplabv3_mobilenet_v3_large | Image segmentation | MobileNetV3 Large | [2] | isc-demos/tv-segmentation |

Torchvision detection

(from https://github.com/pytorch/vision/tree/main/references/detection)

| Title | Description | Model | Status | Link |
| --- | --- | --- | --- | --- |
| maskrcnn_resnet101_fpn | Object detection | Mask RCNN (ResNet101 FPN) | [2] | isc-demos/tv-detection |
| retinanet_resnet101_fpn | Object detection | RetinaNet (ResNet101 FPN) | [2] | isc-demos/tv-detection |

Detectron2

(from https://github.com/facebookresearch/detectron2)

| Title | Description | Model | Status | Link |
| --- | --- | --- | --- | --- |
| detectron2 | TBC | Detectron2 | [2] | isc-demos/detectron2 |
| detectron2_densepose | TBC | Detectron2 | [2] | isc-demos/detectron2/projects/densepose |

Large Language Models (LLM)

| Title | Description | Model | Status | Link |
| --- | --- | --- | --- | --- |
| Llama2 | LoRA | Llama2 | [0] | isc-demos/llama2 |
| Mistral | TBC | Mistral | [0] | isc-demos/mistral |