Support for more languages #10
Comments
Hi @EmilStenstrom, thanks for your interest. Supporting more languages is WIP and we plan to include that in future versions.
Let me know if there’s something I can do to help! (Native Swedish speaker)
Hi @EmilStenstrom, we meet again! We are looking into training a Swedish BLINK, but we have noticed there is not much documentation on the data preprocessing and training pipelines. Would it be possible for someone to add a step-by-step guide for training a model for another language? Especially how you go from the Wikipedia dumps to training data. @ledw
I've created a new repository for training bi-encoder models. You can train the model in another language with an appropriate transformer model, either using the BLINK code or following this tutorial.
The link is not available now. Can you update it? Thanks.
Hi buddy, could you update this tutorial link? It's not available. Thanks.
There's a tutorial on how to train a smaller bi-encoder model here: #116
It looks like this architecture would work for non-English languages too. Wikipedia is available in more languages, flair has embeddings in other languages, and BERT is available elsewhere.
Is there something stopping this from being applied to e.g. Swedish?