This repository has been archived by the owner on Mar 1, 2024. It is now read-only.
Hi, thanks for your code and paper. I am new to EL and I have a question:
Are the bi-encoder and cross-encoder optimized jointly or separately? Specifically, what is the relationship between the loss function in Eq. 4, the loss function in the paragraph following Eq. 6, and the loss function in Eq. 10?
@lshowway Hi, sorry for the delay in responding. The bi-encoder and cross-encoder are optimized separately, and the three loss functions are independent of each other. To be more specific:
1. Use Eq. 4 to train a bi-encoder.
2. Use Eq. 6 to train a cross-encoder.
3. Use Eq. 10 to train a distillation model (teacher: cross-encoder, student: bi-encoder).

I hope that answers your question!
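As a rough illustration of how three independent objectives like these fit together, here is a minimal Python sketch. It assumes the two training losses are softmax cross-entropy over candidate scores and that the distillation loss is a KL divergence from the teacher's candidate distribution to the student's. Those forms, the function names, and the example scores are all assumptions made for illustration; they are not taken from the paper's equations or from this repository.

```python
import math

def softmax(scores, temperature=1.0):
    """Turn raw candidate scores into a probability distribution."""
    scaled = [s / temperature for s in scores]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(scores, gold_index):
    """Negative log-likelihood of the gold candidate (assumed form of the
    bi-encoder and cross-encoder training losses)."""
    return -math.log(softmax(scores)[gold_index])

def distillation_loss(student_scores, teacher_scores, temperature=1.0):
    """KL(teacher || student) over candidates (assumed form of the
    distillation loss; the teacher scores are treated as fixed targets)."""
    p = softmax(teacher_scores, temperature)
    q = softmax(student_scores, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical scores for one mention with three candidates (gold is index 0).
bi_scores = [2.0, 0.5, -1.0]     # from the bi-encoder
cross_scores = [3.0, 0.2, -2.0]  # from the cross-encoder

# Each stage has its own loss and updates only its own model.
stage1_loss = cross_entropy(bi_scores, gold_index=0)     # bi-encoder only
stage2_loss = cross_entropy(cross_scores, gold_index=0)  # cross-encoder only
stage3_loss = distillation_loss(bi_scores, cross_scores) # student bi-encoder only
```

In a real setup each loss would be minimized in its own training run, and the cross-encoder's weights would stay frozen while the student bi-encoder is distilled.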