Word Embedding

This folder contains examples and best practices, written in Jupyter notebooks, for training word embeddings on custom data from scratch.
There are three typical methods for training word embeddings: Word2Vec, GloVe, and fastText. All three methods offer publicly available pretrained models (a pretrained model with Word2Vec, a pretrained model with GloVe, and a pretrained model with fastText).
These pretrained models are trained on general corpora such as Wikipedia and Common Crawl data, and may not serve well in situations where you have a domain-specific language problem or where no pretrained model exists for the language you need to work with. In this folder, we provide examples of how to apply each of the three methods to train your own word embeddings.
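As a minimal illustration of training embeddings from scratch, the sketch below fits a small Word2Vec model with gensim (assuming gensim 4.x; the toy sentences and all parameter values are placeholders, not the settings used in the notebooks):

```python
from gensim.models import Word2Vec

# Toy corpus: in practice, replace with tokenized sentences from your own
# domain-specific data.
sentences = [
    ["word", "embeddings", "map", "words", "to", "vectors"],
    ["fasttext", "extends", "word2vec", "with", "subword", "information"],
]

# Train a small skip-gram Word2Vec model from scratch.
model = Word2Vec(
    sentences,
    vector_size=50,  # dimensionality of the learned vectors
    window=5,        # context window size
    min_count=1,     # keep every word, since the toy corpus is tiny
    sg=1,            # 1 = skip-gram, 0 = CBOW
    epochs=10,
)

# Look up the learned vector for a word in the vocabulary.
vec = model.wv["embeddings"]
print(vec.shape)  # (50,)
```

The same overall workflow (prepare tokenized sentences, train, then query the learned vectors) also applies to the fastText and GloVe examples in the notebooks, with library-specific APIs.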

What is Word Embedding?

Word embedding is a technique for mapping words or phrases from a vocabulary to vectors of real numbers. The learned vector representations capture syntactic and semantic relationships between words and can therefore be very useful for tasks such as sentence similarity, text classification, etc.
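For example, semantically related words end up close together in the embedding space, which is commonly measured with cosine similarity. The sketch below illustrates the idea with made-up vectors; the words, dimensionality, and values are purely hypothetical.

```python
import numpy as np

# Hypothetical 4-dimensional word vectors; real embeddings typically have
# 50-300 dimensions and are learned from a corpus rather than hand-written.
vectors = {
    "king":  np.array([0.9, 0.8, 0.1, 0.3]),
    "queen": np.array([0.8, 0.9, 0.1, 0.4]),
    "apple": np.array([0.1, 0.2, 0.9, 0.7]),
}

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means identical direction."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Semantically related words should have higher cosine similarity.
print(cosine_similarity(vectors["king"], vectors["queen"]))  # close to 1
print(cosine_similarity(vectors["king"], vectors["apple"]))  # noticeably lower
```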

Summary

| Notebook | Environment | Description | Dataset | Language |
|---|---|---|---|---|
| Developing Word Embeddings | Local | Shows how to learn word representations with Word2Vec, fastText, and GloVe | STS Benchmark dataset | en |