From f975a6877e83875c6b62693bd6a26bc82c9b5f9a Mon Sep 17 00:00:00 2001
From: Zhensheng Yuan <564361+yzslab@users.noreply.github.com>
Date: Fri, 12 Jul 2024 13:23:44 +0800
Subject: [PATCH] Update README.md: add new multiple GPU training

---
 README.md | 35 +++++++++++++++++++++++++++++++----
 1 file changed, 31 insertions(+), 4 deletions(-)

diff --git a/README.md b/README.md
index 664bac46..eabd50db 100644
--- a/README.md
+++ b/README.md
@@ -4,9 +4,9 @@
 * Web Viewer
 * Changelog
 ## Known issues
-* Multi-GPU training can only be enabled after densification
+* ~~Multi-GPU training can only be enabled after densification~~ (Try 2.16. New Multiple GPU training strategy)
 ## Features
-* Multi-GPU/Node training (only after densification)
+* Multi-GPU/Node training
 * Switch between diff-gaussian-rasterization and nerfstudio-project/gsplat
 * Multiple dataset types support
   * Blender (nerf_synthetic)
@@ -167,8 +167,10 @@ python main.py fit \
     ...
 ```

-### 2.4. Multi-GPU training
-[NOTE] Multi-GPU training can only be enabled after densification. You can start a single GPU training at the beginning, and save a checkpoint after densification finishing. Then resume from this checkpoint and enable multi-GPU training.
+### 2.4. Multi-GPU training (DDP)
+[NOTE] Try New Multiple GPU training strategy, which can be enabled during densification.
+
+[NOTE] Multi-GPU training with DDP strategy can only be enabled after densification. You can start a single GPU training at the beginning, and save a checkpoint after densification finishing. Then resume from this checkpoint and enable multi-GPU training.

 You will get improved PSNR and SSIM with more GPUs:
 ![image](https://github.com/yzslab/gaussian-splatting-lightning/assets/564361/06e91e71-5068-46ce-b169-524a069609bf)
@@ -488,6 +490,31 @@ python main.py validate \

 Then you can find the rendered masks and images in `outputs/brandenburg_gate/val`.

+### 2.16. New Multiple GPU training strategy
+
+#### Introduction
+This is a bit like a simplified version of Scaling Up 3DGS.
+
+In the implementation here, Gaussians are stored, projected and their colors are calculated in a distributed manner, and each GPU rasterizes a whole image for a different camera. No Pixel-wise Distribution currently.
+
+This strategy works with densification enabled.
+
+### Usage
+* Training
+```bash
+python main.py fit \
+    --config configs/distributed.yaml \
+    ...
+```
+* Merge checkpoints
+```bash
+python utils/merge_distributed_ckpts.py outputs/TRAINED_MODEL_DIR
+```
+* Start viewer
+```bash
+python viewer.py outputs/TRAINED_MODEL_DIR/checkpoints/MERGED_CHECKPOINT_FILE
+```
+
 ## 3. Evaluation
 Per-image metrics will be saved to `TRAINING_OUTPUT/metrics` as a `csv` file.