The notebook references http://opus.nlpl.eu/download.php?f=OpenSubtitles/v2018/mono/OpenSubtitles.it.gz as the source. When I visit the linked opus.nlpl.eu page, I see a grid with a bunch of LANG.xml.gz files, but I cannot locate any file other than the Italian one. Could you link me to the exact page where I can find alternatives to Italian, so that I can train the model on a different data source?
https://opus.nlpl.eu/OpenSubtitles-v2018.php is the page listing all the conversational datasets provided by OpenSubtitles.
Look for the first row in the second table, corresponding to the monolingual plain text files (tokenized).