Hi, thanks @lkeab and ETH's vision team for open-sourcing this project!
From my understanding, your method (the HQ token and its integration) is lightweight and was trained at low computational cost on the HQSeg-44K dataset, which is very diverse and whose only common feature is the high quality of its labels. As a result, SAM-HQ is a strictly better version of SAM, with the same or better generalization and better segmentation quality.
Your method could also be used to fine-tune SAM on very specific tasks. But is it well suited to that context, where we care more about the task's specific performance than about the model's generalization or zero-shot performance? Do you have any insights on how to use your method for task-specific fine-tuning?
I'd appreciate any comments on this topic! [@lkeab I'm posting on GH because people might be interested in your views on this!]
Thanks,
Simon
Hi Simon, for task-specific fine-tuning of SAM, you can allow more model parameters to be trainable on top of HQ-SAM's current training setup, for example by adding some LoRA layers to the mask decoder or image encoder, or simply unfreezing the whole mask decoder or more blocks in the image encoder. Also, I would suggest setting hq_token_only = True during inference after successfully fine-tuning the model.
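A minimal sketch of the "unfreeze the whole mask decoder" option, assuming the sam-hq repo layout where sam_model_registry builds a model with image_encoder, prompt_encoder and mask_decoder sub-modules (the checkpoint path below is hypothetical; check the exact names against the repo):

```python
# Sketch: task-specific fine-tuning by unfreezing only the mask decoder.
import torch
from segment_anything import sam_model_registry

# Hypothetical checkpoint path; replace with your local HQ-SAM weights.
sam = sam_model_registry["vit_l"](checkpoint="sam_hq_vit_l.pth")

# Freeze everything (image encoder, prompt encoder), then unfreeze the mask decoder.
for p in sam.parameters():
    p.requires_grad = False
for p in sam.mask_decoder.parameters():
    p.requires_grad = True

trainable = [p for p in sam.parameters() if p.requires_grad]
optimizer = torch.optim.AdamW(trainable, lr=1e-4, weight_decay=1e-4)
print(f"trainable params: {sum(p.numel() for p in trainable):,}")
```

At inference, if you use the repo's predictor, the idea is to request the HQ output only (e.g. passing hq_token_only=True to the predict call, as suggested above); the exact argument placement may differ between the predictor and the raw model, so verify against the repo's demo scripts.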
I also have another question: in my task, each image comes with several masks; think of items like people in a picture. Each picture can contain between 1 and 10 people. Currently, for each image (image.jpg) I have a single binary mask (image.png) gathering all the different instances; would you recommend splitting this mask into many separate masks?
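If splitting were desired, one possible way (a sketch only, assuming instances do not touch or overlap in the combined mask; overlapping people would need proper per-instance annotations instead) is to separate the binary mask into connected components:

```python
# Sketch: split a combined binary mask (image.png) into one mask per instance
# via connected components. Only valid if instances are spatially separated.
import cv2
import numpy as np

combined = cv2.imread("image.png", cv2.IMREAD_GRAYSCALE)
binary = (combined > 127).astype(np.uint8)

num_labels, labels = cv2.connectedComponents(binary)
# Label 0 is the background; labels 1..num_labels-1 are individual instances.
instance_masks = [(labels == i).astype(np.uint8) for i in range(1, num_labels)]
print(f"split into {len(instance_masks)} instance masks")
```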