Optimize bev_pool_grad_kernel #302

Open
wants to merge 1 commit into
base: dev2.1

@GHGmc2 GHGmc2 commented Oct 25, 2023

We can get a ~4x speedup on an A100 80GB for the following shapes:

out_grad: torch.Size([10, 1, 192, 256, 128]), torch.float32
depth_grad: torch.Size([10, 7, 120, 64, 120]), torch.float32
feat_grad: torch.Size([10, 7, 64, 120, 128]), torch.float32
depth: torch.Size([10, 7, 120, 64, 120]), torch.float32
feat: torch.Size([10, 7, 64, 120, 128]), torch.float32
ranks_depth: torch.Size([28994652]), torch.int32
ranks_feat: torch.Size([28994652]), torch.int32
ranks_bev: torch.Size([28994652]), torch.int32
interval_lengths_bp: torch.Size([537600]), torch.int32
interval_starts_bp: torch.Size([537600]), torch.int32
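
For readers unfamiliar with what this kernel computes, here is a minimal pure-Python sketch of an interval-based BEV pooling backward pass. It is an assumption inferred from the tensor names and shapes above, not the code in this PR: the function name `bev_pool_grad_reference` and the flat-list layout are hypothetical. The shapes are at least consistent with this reading: 537600 = 10 * 7 * 64 * 120, so there appears to be one interval per `feat` pixel, with 128 channels each.

```python
def bev_pool_grad_reference(c, out_grad, depth, feat,
                            ranks_depth, ranks_feat, ranks_bev,
                            interval_starts, interval_lengths,
                            depth_grad, feat_grad):
    """Hypothetical CPU reference; every tensor is a flat Python list.

    Assumes points are sorted so that each interval shares one ranks_feat value.
    """
    for start, length in zip(interval_starts, interval_lengths):
        points = range(start, start + length)
        # depth_grad: one scalar per point, a dot product over the c channels.
        for i in points:
            og = ranks_bev[i] * c
            ft = ranks_feat[i] * c
            depth_grad[ranks_depth[i]] = sum(
                out_grad[og + k] * feat[ft + k] for k in range(c))
        # feat_grad: one value per channel, summed over the interval's points.
        ft = ranks_feat[start] * c
        for k in range(c):
            feat_grad[ft + k] = sum(
                out_grad[ranks_bev[i] * c + k] * depth[ranks_depth[i]]
                for i in points)


# Toy problem: 2 channels, 2 BEV cells, 2 feature pixels, 3 points.
c = 2
out_grad = [1, 2, 3, 4]      # BEV cell 0 -> [1, 2], cell 1 -> [3, 4]
feat = [5, 6, 7, 8]          # pixel 0 -> [5, 6], pixel 1 -> [7, 8]
depth = [0.5, 0.25, 1.0]
ranks_depth = [0, 1, 2]
ranks_feat = [0, 0, 1]       # points 0 and 1 share feature pixel 0
ranks_bev = [0, 1, 1]
interval_starts = [0, 2]
interval_lengths = [2, 1]
depth_grad = [0.0] * 3
feat_grad = [0.0] * 4

bev_pool_grad_reference(c, out_grad, depth, feat,
                        ranks_depth, ranks_feat, ranks_bev,
                        interval_starts, interval_lengths,
                        depth_grad, feat_grad)
print(depth_grad)  # [17, 39, 53]
print(feat_grad)   # [1.25, 2.0, 3.0, 4.0]
```

Under this reading, each of the ~29M points does a 128-element dot product for `depth_grad`, and each interval re-reads its points once per channel for `feat_grad`, so the access pattern into `out_grad` is where a GPU implementation would most plausibly gain or lose time.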

@rubbish001

Is there a similar improvement for the forward pass? It is too slow at test time. Waiting for a fix.
