Filtering words composed of more than 1 token #4
Comments
I think one option would be to compute the probability of the whole token sequence being generated and use it the same way the single-token probability is used. Say a word splits into two tokens s1, s2: instead of p(w|x) in equation 5, you could use p(s1|x) * p(s2|s1,x), and I suspect everything else should work as is. I haven't tested this; if you have any luck with it, let us know. Alternatively, I plan on testing it at some point soon and can get back to you (and will update the code accordingly).
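To make the suggestion above concrete, here is a minimal sketch of the chain-rule arithmetic in plain Python. It is not the PPLM code and is untested against it: `next_token_dist`, `toy_model`, `word_log_prob`, and `bag_log_prob` are all hypothetical names, and the "model" is a hard-coded toy distribution standing in for the language model. Summing the word probabilities over the bag is my assumption about how p(w|x) enters equation 5. The sketch covers only the probability computation, not the gradient step.

```python
import math
from typing import Callable, Dict, List, Sequence

# Hypothetical interface: given the token ids seen so far, return a
# distribution over the next token id (token id -> probability).
NextTokenDist = Callable[[Sequence[int]], Dict[int, float]]


def word_log_prob(next_token_dist: NextTokenDist,
                  context: Sequence[int],
                  word_tokens: Sequence[int]) -> float:
    """log p(w|x) = sum_i log p(s_i | s_1..s_{i-1}, x) for a multi-token word."""
    ctx = list(context)
    total = 0.0
    for tok in word_tokens:
        p = next_token_dist(ctx).get(tok, 0.0)
        if p <= 0.0:
            return float("-inf")
        total += math.log(p)
        ctx.append(tok)  # condition the next sub-token on this one
    return total


def bag_log_prob(next_token_dist: NextTokenDist,
                 context: Sequence[int],
                 bag: List[Sequence[int]]) -> float:
    """log of the summed word probabilities over the bag (log-sum-exp)."""
    logs = [word_log_prob(next_token_dist, context, w) for w in bag]
    finite = [l for l in logs if l != float("-inf")]
    if not finite:
        return float("-inf")
    m = max(finite)
    return m + math.log(sum(math.exp(l - m) for l in finite))


def toy_model(context: Sequence[int]) -> Dict[int, float]:
    """Toy stand-in for a language model, NOT a real one."""
    if context and context[-1] == 1:
        return {2: 0.5, 3: 0.5}
    return {1: 0.4, 2: 0.1, 3: 0.5}


# A two-token word [1, 2] and a single-token word [3], given context [0].
two_token = math.exp(word_log_prob(toy_model, [0], [1, 2]))
bag_total = math.exp(bag_log_prob(toy_model, [0], [[1, 2], [3]]))
```

With a real model, each extra sub-token would need one more forward pass conditioned on the previous sub-tokens, so the cost grows with the longest word in the bag.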
I tried to run this code, and all the words are composed of more than one token,
Hi, thanks for the great work.
I see that you are filtering out words that are composed of more than one token:
PPLM/run_pplm.py
Line 390 in 5f27e19
Do you have any idea how to handle this when we want to use these multi-token words?
Cheers.