
Question about '3D joint position y' in Projective Attention #30

Open · Billccx opened this issue Nov 9, 2023 · 2 comments

Billccx commented Nov 9, 2023

Hello!

I'm wondering what y refers to in the Projective Attention module.

Is it the ground truth of the 3D keypoints? If so, how is it handled during inference, when no ground truth is available?

I'm looking forward to your reply; thanks in advance!

[two images attached]

twangnh (Collaborator) commented Nov 9, 2023

Hi, it means the 3D joint location prediction of the current decoder layer.

Billccx (Author) commented Nov 9, 2023

> Hi, it means the 3D joint location prediction of the current decoder layer.

Thanks for your quick reply!

I have checked the source code, and it seems that `reference_points` is y?

Does this mean that y is actually the `reference_points` output by the previous decoder layer?

```python
def forward(self, tgt, reference_points, src_views,
            src_views_with_rayembed, meta, src_spatial_shapes,
            src_level_start_index, src_valid_ratios,
            query_pos=None, src_padding_mask=None):
    output = tgt
    intermediate = []
    intermediate_reference_points = []
    for lid, layer in enumerate(self.layers):
        reference_points_input = reference_points[:, :, None]
        output = layer(output, query_pos, reference_points_input,
                       src_views, src_views_with_rayembed,
                       src_spatial_shapes,
                       src_level_start_index, meta, src_padding_mask)
        # hack implementation for iterative pose refinement:
        # each layer predicts an offset that updates the reference
        # points in inverse-sigmoid (logit) space
        if self.pose_embed is not None:
            tmp = self.pose_embed[lid](output)
            new_reference_points = tmp + inverse_sigmoid(reference_points)
            new_reference_points = new_reference_points.sigmoid()
            # detach so gradients do not flow across layers
            # through the refined points
            reference_points = new_reference_points.detach()
        if self.return_intermediate:
            intermediate.append(output)
            intermediate_reference_points.append(reference_points)
    if self.return_intermediate:
        return torch.stack(intermediate), \
            torch.stack(intermediate_reference_points)
    return output, reference_points
```

I am not sure if my understanding is correct. If it is not, could you please explain it in detail? Thank you.
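
For anyone following along, the refinement step can be reproduced in isolation. Below is a minimal sketch of one iteration; the tensor shapes and the random stand-in for `self.pose_embed[lid](output)` are placeholders, not the model's actual values:

```python
import torch

def inverse_sigmoid(x, eps=1e-5):
    # logit function, clamped for numerical stability
    x = x.clamp(min=eps, max=1 - eps)
    return torch.log(x / (1 - x))

# Placeholder shapes: 1 batch, 15 joints, 3 normalized coordinates in [0, 1].
reference_points = torch.rand(1, 15, 3)  # points from the previous layer
offset = torch.randn(1, 15, 3) * 0.1     # stand-in for self.pose_embed[lid](output)

# One refinement step: add the offset in logit space, squash back to [0, 1],
# and detach so the next layer treats the refined points as fixed inputs.
new_reference_points = (offset + inverse_sigmoid(reference_points)).sigmoid()
reference_points = new_reference_points.detach()
print(reference_points.shape)  # torch.Size([1, 15, 3])
```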
