Upcoming changes to Python library #13
alasdairforsythe
I'm planning the following changes to the Python library which will be implemented within the next couple of days:
tokenize will return a numpy array instead of a list of ints (decode will accept either).
The functions convert_ids_to_tokens, convert_ids_to_tokens_decoded & convert_tokens_to_ids will be removed. I originally intended them for compatibility with Hugging Face Transformers, but it's clear now that was the wrong approach, as they cannot be used in the same context as the similarly named functions in Transformers' tokenizer classes. Use token_to_id and id_to_token instead, or preferably decode.
If you have any questions, concerns or suggestions on this, please reply here.
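For anyone who needs to adapt calling code, here is a rough sketch of one way to cope with both changes. It is not part of the library: `StubVocab`, `ids_as_list`, `ids_to_tokens` and `tokens_to_ids` are hypothetical names made up for illustration, and the sketch assumes that `id_to_token` and `token_to_id` each take a single item.

```python
class StubVocab:
    """Hypothetical stand-in for a loaded vocabulary (NOT the library's class).

    It only mimics single-item id_to_token / token_to_id lookups so the
    helpers below can be run without the real library installed.
    """

    def __init__(self, tokens):
        self._tokens = list(tokens)
        self._ids = {tok: i for i, tok in enumerate(self._tokens)}

    def id_to_token(self, token_id):
        return self._tokens[token_id]

    def token_to_id(self, token):
        return self._ids[token]


def ids_as_list(ids):
    """Return a plain list of ints from either a numpy array or a list.

    numpy arrays expose .tolist(); plain lists and tuples do not, so
    they are copied element by element instead.
    """
    if hasattr(ids, "tolist"):
        return ids.tolist()
    return [int(i) for i in ids]


def ids_to_tokens(vocab, ids):
    """Per-ID lookup, standing in for the removed batch function."""
    return [vocab.id_to_token(i) for i in ids_as_list(ids)]


def tokens_to_ids(vocab, tokens):
    """Per-token lookup, standing in for the removed batch function."""
    return [vocab.token_to_id(tok) for tok in tokens]
```

With the real library you would pass your loaded vocabulary object in place of `StubVocab`. If all you want is the text back, calling decode directly on the token IDs is the simpler route.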