Optimize bev_pool_grad_kernel #302

GHGmc2 · 2023-10-25T11:50:33Z

This change gives a ~4x speedup on an A100 80GB for the following shapes:

out_grad: torch.Size([10, 1, 192, 256, 128]), torch.float32
depth_grad: torch.Size([10, 7, 120, 64, 120]), torch.float32
feat_grad: torch.Size([10, 7, 64, 120, 128]), torch.float32
depth: torch.Size([10, 7, 120, 64, 120]), torch.float32
feat: torch.Size([10, 7, 64, 120, 128]), torch.float32
ranks_depth: torch.Size([28994652]), torch.int32
ranks_feat: torch.Size([28994652]), torch.int32
ranks_bev: torch.Size([28994652]), torch.int32
interval_lengths_bp: torch.Size([537600]), torch.int32
interval_starts_bp: torch.Size([537600]), torch.int32
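
For a sense of scale, the sizes above can be turned into element counts and memory footprints. This is only a rough sketch in plain Python: the shapes and dtypes are copied from the list above, while the helper names (`numel`, `footprint_bytes`) are made up for illustration and are not part of this PR or the kernel.

```python
from math import prod

# Shapes copied from the list above. Every listed dtype
# (torch.float32 / torch.int32) is 4 bytes per element.
BYTES_PER_ELEMENT = 4

SHAPES = {
    "out_grad": (10, 1, 192, 256, 128),
    "depth_grad": (10, 7, 120, 64, 120),
    "feat_grad": (10, 7, 64, 120, 128),
    "depth": (10, 7, 120, 64, 120),
    "feat": (10, 7, 64, 120, 128),
    "ranks_depth": (28994652,),
    "ranks_feat": (28994652,),
    "ranks_bev": (28994652,),
    "interval_lengths_bp": (537600,),
    "interval_starts_bp": (537600,),
}


def numel(shape):
    """Number of elements in a tensor of the given shape (hypothetical helper)."""
    return prod(shape)


def footprint_bytes(shapes=SHAPES):
    """Map each tensor name to its size in bytes (hypothetical helper)."""
    return {name: numel(shape) * BYTES_PER_ELEMENT for name, shape in shapes.items()}


if __name__ == "__main__":
    sizes = footprint_bytes()
    for name, size in sizes.items():
        print(f"{name:>20}: {numel(SHAPES[name]):>12,d} elements, {size / 2**20:8.1f} MiB")
    print(f"{'total':>20}: {sum(sizes.values()) / 2**30:.2f} GiB")
```

Note that the interval count, 537600, equals 10 * 7 * 64 * 120, the product of the first four dimensions of `feat`.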

rubbish001 · 2024-07-04T09:23:17Z

Is there a similar optimization for the forward pass? It is too slow at test time. Waiting for a fix.

Optimize bev_pool_grad_kernel

a8ec61f

GHGmc2 force-pushed the dev2.1 branch from 81b080f to a8ec61f on October 25, 2023 11:53
