Why average loss value by batch_size when using batch_hard method?
Hi Olivier,

Thanks for your work, it helped me a lot! Just one question:

In the `batch_hard_triplet_loss` method, you calculate the average loss value using `tf.reduce_mean`, which is equivalent to `tf.reduce_sum(tensor) / len(tensor)`. As a result, the losses are averaged over `batch_size`. Shouldn't the output loss value instead be the average over the number of non-zero losses? For example, if `triplet_loss` at line 218 is `[1.4, 0, 0, 0, 1.6]`, the function outputs 0.6 according to the code, but shouldn't it be 1.5?

I saw a similar discussion under your triplet loss post, but the code doesn't seem to reflect that idea.
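For concreteness, here is a rough sketch of the alternative I have in mind (not your code; the helper name `mean_over_nonzero_losses` and the small epsilon guarding against division by zero are my own additions):

```python
import tensorflow as tf

def mean_over_nonzero_losses(triplet_loss, eps=1e-16):
    """Average only over triplets with a strictly positive loss.

    triplet_loss: 1-D tensor of per-triplet hinge losses, e.g. the
    tensor computed at line 218 of batch_hard_triplet_loss.
    """
    # Mask of triplets whose loss is non-zero (i.e. not yet "easy")
    nonzero = tf.cast(tf.greater(triplet_loss, eps), tf.float32)
    num_nonzero = tf.reduce_sum(nonzero)
    # Divide by the count of non-zero losses instead of batch_size;
    # eps avoids division by zero when every triplet loss is zero
    return tf.reduce_sum(triplet_loss) / (num_nonzero + eps)
```

With the example above, this returns 1.5 instead of 0.6.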