Finalize MCMC strategy and some tiny fix #3548

Open
KevinXu02 wants to merge 11 commits into main

Conversation

@KevinXu02 (Contributor) commented Dec 14, 2024

Finalizes the MCMC strategy (#3436), fixes the bilagrid learning rate for splatfacto-big (#3383), and makes some small changes to the colmap dataparser (it now automatically iterates over possible colmap paths).
Tested on the bicycle scene with random/SfM initialization.
To use: ns-train splatfacto-mcmc
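
For example (a hedged sketch; the dataset path is a placeholder):

    ns-train splatfacto-mcmc --data <path/to/colmap/dataset>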

@pablovela5620 (Contributor)

tried to use this but got a torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 1.47 GiB. GPU error with mcmc, where I had no problems with vanilla splatfacto

@KevinXu02 (Contributor, Author)

> tried to use this but got a torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 1.47 GiB. GPU error with mcmc, where I had no problems with vanilla splatfacto

Yes, this might happen since MCMC uses a fixed number of Gaussians for the scene. Could you please try adding --pipeline.model.max_cap [max num of gs] to your training command? A common value would be 1000000, but you can make it smaller.
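
For example (a hedged sketch; the flag spelling follows the comment above, 1000000 is just the common value mentioned, and the dataset path is a placeholder):

    ns-train splatfacto-mcmc --pipeline.model.max_cap 1000000 --data <path/to/your/data>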

@pablovela5620 (Contributor)

> tried to use this but got a torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 1.47 GiB. GPU error with mcmc, where I had no problems with vanilla splatfacto
>
> Yes, this might happen since MCMC uses a fixed number of Gaussians for the scene. Could you please try adding --pipeline.model.max_cap [max num of gs] to your training command? A common value would be 1000000, but you can make it smaller.

Even setting it to 500,000 sadly leads to an OOM error (I'm using a 3060); I had to reduce it to 100,000 Gaussians. That seems like too few(?), but I honestly don't know. I'll do some more testing.

@gradeeterna commented Dec 20, 2024

I'm also getting OOM errors here with --pipeline.model.max-gs-num 1000000 on a 3090 with 24GB VRAM. Same dataset works fine with splatfacto and MCMC+bilagrid directly in gsplat.

    ),
    model=SplatfactoModelConfig(
        strategy="mcmc",
        mcmc_opacity_reg=0.01,

Collaborator:

can we avoid redefining these here so that the defaults use these values?

Collaborator:

that said we also need to update cull_alpha_threshold to .005 here!
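
A minimal sketch of what these two review comments ask for, assuming the MCMC-specific values (e.g. mcmc_opacity_reg=0.01) become the SplatfactoModelConfig defaults so the method config only selects the strategy, with the cull threshold lowered as suggested (the surrounding config structure is abbreviated and illustrative):

    ),
    model=SplatfactoModelConfig(
        strategy="mcmc",
        cull_alpha_threshold=0.005,  # lowered per the comment above; MCMC-specific values come from the defaults
    ),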

@kerrj (Collaborator) commented Dec 21, 2024

> I'm also getting OOM errors here with --pipeline.model.max-gs-num 1000000 on a 3090 with 24GB VRAM. Same dataset works fine with splatfacto and MCMC+bilagrid directly in gsplat.

It's possible there are some memory differences between gsplat and nerfstudio because of dataloader overhead. When you run gsplat, is the memory usage very close to 24GB? It may help to make sure the images are cached on CPU inside splatfacto-mcmc (there's a FullImagesDataManager parameter for this). If splatfacto-mcmc is taking significantly more memory than gsplat's version that would be surprising, but I think a difference of ~1GB is expected. MCMC in general will take more memory than the default strategy, since the re-sampling step is a little memory hungry.
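
For reference, a hedged example of forcing CPU image caching from the command line, assuming the parameter kerrj mentions is exposed as the datamanager's cache-images option (the exact flag name may differ, and the dataset path is a placeholder):

    ns-train splatfacto-mcmc --pipeline.datamanager.cache-images cpu --data <path/to/your/data>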
